
How to Configure a Spark Application (Scala and Java 8 with Maven) in Eclipse

Apache Spark is becoming very popular among organizations looking to leverage its fast, in-memory computing capability for big-data processing. This article helps beginners get started with Spark setup on Eclipse/Scala IDE and get familiar with Spark terminology in general –

I hope you have read the previous article on RDD basics to get a basic understanding of Spark RDDs.
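Since this setup targets Java 8, it helps to be comfortable with lambda expressions, which Spark's Java API (flatMap, mapToPair, reduceByKey) leans on heavily. As a plain-Java warm-up with no Spark dependency (the class name and sample text below are mine, not from the original post), a word count with java.util.stream looks like:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

public class WordCount {

    // Split on whitespace and count occurrences of each word.
    public static Map<String, Long> count(String text) {
        return Arrays.stream(text.toLowerCase().split("\\s+"))
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Prints each word with its frequency, e.g. be=2, to=2, or=1, not=1
        System.out.println(count("to be or not to be"));
    }
}
```

The Spark version of this is structurally the same pipeline, just expressed over an RDD instead of a stream.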

Tools Used:

  • Scala IDE for Eclipse – download the latest version of Scala IDE from here. I used Scala IDE 4.7.0 Release, which supports both Scala and Java.
  • Scala Version – 2.11 (make sure the Scala compiler is set to this version as well)
  • Spark Version – 2.2 (provided via Maven dependency)
  • Java Version – 1.8
  • Maven Version – 3.3.9 (embedded in Eclipse)
  • winutils.exe – required for running Spark on Windows
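The versions listed above are pulled in through Maven; a minimal sketch of the relevant pom.xml dependency (the 2.2.0 patch version and the _2.11 Scala-suffix are my assumptions based on the versions listed):

```xml
<!-- Spark core, built against Scala 2.11, matching the tool versions above -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.2.0</version>
</dependency>
```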


More Effective Java With Joshua Bloch

Many of us already know and agree on how great the book “Effective Java” by Joshua Bloch is, and that it is a must-read for every Java developer out there, whether you have just started or have been working for a while. While reading the book and researching some of the items listed in it, I came across this interview with Joshua Bloch at Oracle,

Effective Java: Joshua Bloch

in which he speaks about some of the great things in the book and shares his knowledge on some great topics in the language. This should be a good read for anyone interested in exploring more, either while reading the book or afterwards –

Here is the link –


Also take a look at –

How-To: Generate RESTful API Documentation with Swagger?

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”
– Martin Fowler

What is Swagger?

Swagger is a specification and complete framework implementation for describing, producing, consuming, and visualizing RESTful web services. The goal of Swagger is to enable client and documentation systems to update at the same pace as the server. The documentation of methods, parameters, and models is tightly integrated into the server code, allowing APIs to always stay in sync.

Why is Swagger useful?

The framework simultaneously solves server, client, and documentation/sandbox needs.

With Swagger’s declarative resource specification, clients can understand and consume services without knowledge of server implementation or access to the server code. The Swagger UI framework allows both developers and non-developers to interact with the API in a sandbox UI that gives clear insight into how the API responds to parameters and options.

It happily speaks both JSON and XML, with additional formats in the works.

Now let’s see a working example of how to configure Swagger to generate API documentation for our sample REST API created using Spring Boot.

How to Enable Swagger in Your Spring Boot Web Application?

If you are one of those lazy people who hate reading configurations, download the complete working example here; otherwise, read on –

Step 1 : Include Swagger-SpringMVC dependency in Maven
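The snippet itself is not shown in this copy of the post; a sketch of the Maven dependency, assuming the com.mangofactory swagger-springmvc artifact (1.0.2 was a commonly used release at the time):

```xml
<!-- Swagger-SpringMVC: generates the /api-docs JSON from Spring MVC controllers -->
<dependency>
    <groupId>com.mangofactory</groupId>
    <artifactId>swagger-springmvc</artifactId>
    <version>1.0.2</version>
</dependency>
```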

Step 2 : Create Swagger Java Configuration

  • Use the @EnableSwagger annotation.
  • Autowire SpringSwaggerConfig.
  • Define one or more SwaggerSpringMvcPlugin instances using Spring's @Bean annotation.
[gist https://gist.github.com/saurzcode/9dcee7110707ff996784/]
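If the gist is unavailable, the three bullets above can be sketched as a single Spring configuration class (the class name, bean name, and URL pattern here are illustrative, not from the original post):

```java
import com.mangofactory.swagger.configuration.SpringSwaggerConfig;
import com.mangofactory.swagger.plugin.EnableSwagger;
import com.mangofactory.swagger.plugin.SwaggerSpringMvcPlugin;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableSwagger // enables the Swagger-SpringMVC framework
public class SwaggerConfig {

    private SpringSwaggerConfig springSwaggerConfig;

    @Autowired
    public void setSpringSwaggerConfig(SpringSwaggerConfig springSwaggerConfig) {
        this.springSwaggerConfig = springSwaggerConfig;
    }

    @Bean
    public SwaggerSpringMvcPlugin customImplementation() {
        // Document only endpoints matching /api/* (pattern is illustrative)
        return new SwaggerSpringMvcPlugin(this.springSwaggerConfig)
                .includePatterns("/api/.*");
    }
}
```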

Step 3 : Create Swagger UI using WebJar

To use the WebJar, add the following repository and dependency, which will auto-configure the Swagger UI for you.
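The snippet is missing from this copy of the post; a sketch using the org.webjars packaging of Swagger UI (the version number is my assumption; use whatever release matches your setup):

```xml
<!-- WebJar packaging of the Swagger UI static assets, served from the classpath -->
<dependency>
    <groupId>org.webjars</groupId>
    <artifactId>swagger-ui</artifactId>
    <version>2.0.24</version>
</dependency>
```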


That’s it. Now run Application.java as a Java application in your IDE, and you will see the application running on an embedded Tomcat/Jetty server at the default port 8080.

Verify the API configuration by pointing your browser at http://localhost:8080/api-docs

And finally, you can see the Swagger API docs and test the APIs at http://localhost:8080/index.html

Swagger: API Doc for Spring Boot Application


Also, please note that the default URL in the WebJar files is http://petstore.swagger.wordnik.com/api/api-docs, so you might see an error like this: "Can't read from server. It may not have the appropriate access-control-origin settings." Solution: just replace the URL [http://petstore.swagger.wordnik.com/api/api-docs] on the screen with [http://localhost:8080/api-docs] and you will see the UI as above.


Again, the complete project is available at GitHub.



Do write back in the comments if you face any issues or concerns!

You may also like:


Top 20 Hadoop and Big Data Books

Big Data Books

Hadoop: The Definitive Guide


Hadoop: The Definitive Guide is the ideal guide for anyone who wants to know about Apache Hadoop and all that can be done with it. It is a good book on the basics of Hadoop (HDFS, MapReduce, and other related technologies) and provides all the necessary details to start working with Hadoop and programming against it.

“Now you have the opportunity to learn about Hadoop from a master – not only of the technology, but also of common sense and plain talk.” — Doug Cutting, Hadoop Founder, Yahoo!

The latest version, the 4th edition, is available here – Hadoop: The Definitive Guide 4e


How-To: Become a Hadoop Certified Developer?

Hadoop Certified


Apache Hadoop is an open-source framework for distributed storage and processing of large data sets on commodity hardware. Hadoop enables businesses to quickly gain insight from massive amounts of structured and unstructured data.

Hadoop and Big Data are the hot trends of the industry these days. Most companies are already implementing them, or have at least started showing interest, in order to remain competitive in the market. Big Data and analytics are certainly among the great concepts for the current and forthcoming IT generation, as most innovation is driven by the vast amount of data being generated exponentially.

There are many vendors for Enterprise Hadoop in the industry – Cloudera, Hortonworks (spun out of Yahoo), MapR, and IBM are some of the front runners among them. They all have their own Hadoop distributions, which differ from one another in features while keeping Hadoop at their core. They provide training on various Hadoop and Big Data technologies and, following the industry trend, are also coming out with certifications around these technologies.

In this article I am going to list all the latest available Hadoop certifications from different vendors in the industry. Whether certifications are helpful to your career or not is altogether a different debate and out of scope for this article. They may be useful for those who think they have done enough reading and now want to test themselves, or those who are looking to add value to their portfolios.


CCAH (Administrator) Exams

Cloudera Certified Administrator for Apache Hadoop (CCA-410)

There are currently three versions of this exam –

Exam Code: CCA-410
Number of Questions: 60 questions
Time Limit: 90 minutes
Passing Score: 70%
Language: English, Japanese
Price: USD $295

CCAH CDH 5 Exam
Exam Code: CCA-500
Number of Questions: 60 questions
Time Limit: 90 minutes
Passing Score: 70%
Language: English, Japanese (forthcoming)
Price: USD $295

CCAH CDH 5 Upgrade Exam
Exam Code: CCA-505
Number of Questions: 45 questions
Time Limit: 90 minutes
Passing Score: 70%
Language: English, Japanese (forthcoming)
Price: USD $125

CCAH Practice Test

CCAH Study Guide

CCDH (Developer) Exams

Cloudera Certified Developer for Apache Hadoop (CCD-410)

Exam Code: CCD-410
Number of Questions: 50 – 55 live questions
Time Limit: 90 minutes
Passing Score: 70%
Language: English, Japanese
Price: USD $295

Study Guide: available at the Cloudera site.

Practice Tests: available at the Cloudera site.


For 2.x Certifications

1) Hadoop 2.0 Java Developer Certification

This certification is intended for developers who design, develop and architect Hadoop-based solutions written in the Java programming language.

Time Limit: 90 minutes
Number of Questions: 50
Passing Score: 75%
Price: $150 USD

Practice tests can be taken by registering at the certification site.

2) Hadoop 2.0 Developer Certification

The Certified Apache Hadoop 2.0 Developer certification is intended for developers who design, develop and architect Hadoop-based solutions, consultants who create Hadoop project proposals and Hadoop development instructors.

Time Limit: 90 minutes
Number of Questions: 50
Passing Score: 75%
Price: $150 USD

Practice tests can be taken by registering at the certification site.

3) Hortonworks Certified Apache Hadoop 2.0 Administrator

This certification is intended for administrators who deploy and manage Apache Hadoop 2.0 clusters; the associated course teaches students how to install, configure, maintain, and scale the Hadoop 2.0 environment.

Time Limit: 90 minutes
Number of Questions: 48
Passing Score: 75%
Price: $150 USD

For 1.x Certifications

1) Hadoop 1.0 Developer Certification

Time Limit: 90 minutes
Number of Questions: 53
Passing Score: 75%
Price: $150 USD

2) Hadoop 1.0 Administrator Certification

Time Limit: 60 minutes
Number of Questions: 41
Passing Score: 75%
Price: $150 USD



