Tag Archives: big data

How-To : Write a Kafka Producer using Twitter Stream (Twitter HBC Client)

Twitter open-sourced its Hosebird Client (hbc), a robust Java HTTP library for consuming Twitter’s Streaming API. In this post, I am going to present a demo of how we can use hbc to create a Kafka Twitter stream producer, which tracks a few terms in Twitter statuses and produces a Kafka stream out of them, which can be … Continue Reading ››

How-To : Set Up Realtime Analytics over Logs with ELK Stack : Elasticsearch, Logstash, Kibana

Hadoop : Getting Started with Pig

What is Apache Pig?

Apache Pig is a high-level scripting language that is used with Apache Hadoop. It enables data analysts to write complex data transformations without knowing Java. Its simple SQL-like scripting language is called Pig Latin, and it appeals to developers already familiar with scripting languages and SQL. Pig scripts are converted into MapReduce jobs which run on data … Continue Reading ››

Top 20 Hadoop and Big Data Books

Big Data Books

Hadoop: The Definitive Guide

Hadoop: The Definitive Guide is the ideal … Continue Reading ››

Top 10 Hadoop Shell Commands to manage HDFS

Basically, our goal is to organize the world's information and to make it universally accessible and useful. - Larry Page
So you already know what Hadoop is, why it is used, and what problems you can solve with it, and now you want to know how to deal with files on HDFS? Don't worry, you are at the right … Continue Reading ››