Tag Archives: hadoop

How-To : Use HCatalog with Pig

 Using HCatalog with Pig :-

This post is a step-by-step guide to running HCatalog and using it with Apache Pig. Assumptions: Pig and Hive are installed and tested in their basic modes. HCatalog requires the Hive Metastore and its database to be properly configured (refer to post). Versions tested with: HCatalog … Continue Reading ››

Hive Strict Mode

Sort By vs Order By vs Group By vs Cluster By in Hive

What is Hive Strict Mode ?

Hive Strict Mode (hive.mapred.mode=strict) makes Hive reject certain performance-intensive operations, such as:
  • Queries on a partitioned table that have no partition filter in the WHERE clause.
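
As a rough illustration of the partition-filter restriction, here is a minimal HiveQL sketch. The table page_views and its columns are hypothetical and not taken from the original post; the only setting used is hive.mapred.mode, as named above.

```sql
-- Hypothetical table for illustration, partitioned by the date column dt
CREATE TABLE page_views (user_id STRING, url STRING)
PARTITIONED BY (dt STRING);

-- Turn on strict mode for the current session
SET hive.mapred.mode=strict;

-- Expected to be rejected in strict mode: no filter on the partition column,
-- so Hive would have to scan every partition
SELECT url, COUNT(*) FROM page_views GROUP BY url;

-- Expected to be accepted: the WHERE clause restricts the partition column dt
SELECT url, COUNT(*) FROM page_views WHERE dt = '2014-01-01' GROUP BY url;
```

The second query limits the scan to a single partition, which is the behaviour strict mode is meant to encourage.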

How-To : Configure MySQL Metastore for Hive ?

By default, Hive uses Derby as its metastore storage. Derby is suited only for testing, and for most production scenarios MySQL is the recommended metastore. This is a step-by-step guide on how to configure a MySQL metastore for Hive in place of the default Derby metastore. Assumptions: Basic knowledge of Unix is … Continue Reading ››

Hadoop : Getting Started with Pig

What is Apache Pig?

Apache Pig is a high-level scripting language that is used with Apache Hadoop. It enables data analysts to write complex data transformations without knowing Java. Its simple SQL-like scripting language is called Pig Latin, and it appeals to developers already familiar with scripting languages and SQL. Pig scripts are converted into MapReduce jobs which run on data … Continue Reading ››

Top 20 Hadoop and Big Data Books

Big Data Books

Hadoop: The Definitive Guide

Hadoop: The Definitive Guide is the ideal … Continue Reading ››