saurzcode-spark-eclipseApache Spark is becoming very popular among organization looking to leverage its fast, in-memory computing capability for big-data processing. This article is for beginners to get started with Spark Setup on Eclipse/Scala IDE and  getting familiar with Spark terminologies in general –

Hope you have read previous article on RDD basics , to get a basic understanding of Spark RDD.

Tools Used :

  • Scala IDE for Eclipse – Download latest version of Scala IDE from here .Here, I used Scala IDE 4.7.0 Release, which support both Scala and Java
  • Scala Version – 2.11 ( make sure scala compiler is set to this version as well)
  • Spark Version 2.2 ( provided in maven dependency)
  • Java Version 1.8
  • Maven Version 3.3.9 ( Embedded in Eclipse)
  • winutils.exe

(more…)