Tag Archives: hdfs

Top 15 HDFS Interview Questions

HDFS is the distributed file system used in Hadoop; it makes it possible to store very large files on commodity hardware. When working with Hadoop and Big Data in general, it is very important to understand the basic concepts of the underlying file system, i.e. HDFS in the case of Hadoop. When you are appearing in Big Data interviews, … Continue Reading ››

How-To : Integrate Kafka with HDFS using Camus (Twitter Stream Example)


How-to : Write a CoProcessor in HBase

What is a Coprocessor in HBase?

A coprocessor is a mechanism that helps move computation closer to the data in HBase. It is like a MapReduce framework for distributing tasks across the cluster. You can think of coprocessors as being like either Aspects in Java, which intercept code before and after some critical operations, and … Continue Reading ››

How-To : Setup Development Environment for Hadoop MapReduce

This post is intended for folks who are looking for a quick start on developing a basic Hadoop MapReduce application. We will see how to set up a basic MR application for WordCount using Java, Maven and Eclipse, and run it in local mode, which makes debugging easier at an early stage. Assuming JDK 1.6+ is … Continue Reading ››

Hadoop : Getting Started with Pig

What is Apache Pig?

Apache Pig is a high-level scripting language that is used with Apache Hadoop. It enables data analysts to write complex data transformations without knowing Java. Its simple SQL-like scripting language is called Pig Latin, and appeals to developers already familiar with scripting languages and SQL. Pig scripts are converted into MapReduce jobs which run on data … Continue Reading ››

Getting Started with Hadoop : Free Online Hadoop Trainings

Hadoop Trainings

Oh Yes! It's Free!

With the rising popularity of, increasing demand for, and lack of experts in Big Data and Hadoop technologies, various paid training courses and certifications are available from enterprise Hadoop providers such as Cloudera, Hortonworks, IBM, MapR, etc. But if you don't want to shell out some money … Continue Reading ››