What is HCatalog?
Apache HCatalog is a storage management layer for Hadoop that helps users of different data processing tools in the Hadoop ecosystem, such as Hive, Pig and MapReduce, easily read and write data on the cluster. HCatalog provides a relational view of data stored on HDFS in formats such as RCFile, Parquet, ORC and SequenceFile. It also exposes a REST API … Continue Reading ››
Using HCatalog with Pig
This post is a step-by-step guide to running HCatalog and using it with Apache Pig.
Assumptions:
Pig and Hive are installed and have been tested in their basic modes.
The Hive Metastore and its database are properly configured (Refer to Post).
Versions tested with:
HCatalog … Continue Reading ››