Category: Technology

May 2, 2021

How to Train and Score Catboost Model on Spark

About CatBoost Catboost (developed by Yandex) is one of the great open-source gradient boosting libraries with great performance without a lot of additional tuning. It provides support for categorical features without any need for encoding...

git / Technology

September 15, 2019

Git – How to Split Subdirectory to Separate Repository

If you regret putting that git sub-directory inside a git repository and thinking about moving it out of the current repository to its own repository, you have come to the right place! In this post,...

Big Data / Scala / Spark / Technology

September 14, 2019

Spark – How to Run Spark Applications on Windows

Whether you want to unit test your Spark Scala application using Scala Tests or want to run some Spark application on Windows, you need to perform a few basics settings and configurations before you do...

Big Data / Scala / Spark

September 14, 2019

What does Skipped Stage means in Spark WebUI ?

Skipped Stages in Spark UI You must have come across various scenarios where you see a DAG like below, where you see a few stages shows greyed out with a text (skipped) after the stage...

Big Data / Scala / Spark / Technology

June 7, 2018

Dataframe Operations in Spark using Scala

Dataframe in Apache Spark is a distributed collection of data, organized in the form of columns. Dataframes can be transformed into various forms using DSL operations defined in Dataframes API, and its various functions. In...

Big Data / Java / Technology

May 27, 2018

How to Use MultiThreadedMapper in MapReduce

In simple MapReduce Job each instance of Mapper.map() method is invoked by a single thread and key value pair are passed serially. MultithreadedMapper class is used instead of default Mapper when tasks are CPU bound...

Category: Technology

How to Train and Score Catboost Model on Spark

Like this:

Git – How to Split Subdirectory to Separate Repository

Like this:

Spark – How to Run Spark Applications on Windows

Like this:

What does Skipped Stage means in Spark WebUI ?

Like this:

Dataframe Operations in Spark using Scala

Like this:

How to Use MultiThreadedMapper in MapReduce

Like this:

Category: Technology

How to Train and Score Catboost Model on Spark

Share this:

Like this:

Git – How to Split Subdirectory to Separate Repository

Share this:

Like this:

Spark – How to Run Spark Applications on Windows

Share this:

Like this:

What does Skipped Stage means in Spark WebUI ?

Share this:

Like this:

Dataframe Operations in Spark using Scala

Share this:

Like this:

How to Use MultiThreadedMapper in MapReduce

Share this:

Like this: