How-To : Setup Development Environment for Hadoop MapReduce

This post is intended for folks who are looking out for a quick start on developing a basic Hadoop MapReduce application. We will see how to set up a basic MR application for WordCount using Java, Maven and Eclipse and run a basic MR program in local mode , which is easy for debugging at an early stage. Assuming JDK 1.6+ is … Continue Reading ››

How-To : Use HCatalog with Pig

 Using HCatalog with Pig :-

This post is a step by step guide on running HCatalog and using HCatalog with Apache Pig :- Assumptions : Pig and Hive are installed and tested with basic modes. It requires Hive Metastore and it's databse to be properly configured ( Refer to Post ) Versions Tested With :-¬† HCatalog … Continue Reading ››

Hive Strict Mode

Sort By vs Order By vs Group By vs Cluster By in Hive

What is Hive Strict Mode ?

Hive Strict Mode ( hive.mapred.mode=strict) enables hive to restrict certain performance intensive operations. Such as -
  • It restricts queries of partitioned tables without a WHERE clause.

How-To : Connect HiveServer2 service with JDBC Client ?

HiveServer2 (HS2) is a server interface that enables remote clientsto execute queries against Hive and retrieve the results. The current implementation, based on Thrift RPC, is an improved version of HiveServer and supports multi-client concurrency and authentication. It is designed to provide better support for open API clients like JDBC and ODBC.imagesContinue Reading ››

How-To : Configure MySQL Metastore for Hive ?

Hive by default comes with Derby as its metastore storage, which is suited only for testing purposes and in most of the production scenarios it is recommended to use MySQL as a metastore. This is a step by step guide on How to Configure MySQL Metastore for Hive in place of Derby Metastore (Default). Assumptions : Basic knowledge of Unix is … Continue Reading ››

BigData and Hadoop Tutorial

%d bloggers like this: