Apache HCatalog is a storage management layer for Hadoop that helps users of different data processing tools in the Hadoop ecosystem, such as Hive, Pig and MapReduce, easily read and write data on the cluster. HCatalog provides a relational view of data stored on HDFS in formats such as RCFile, Parquet, ORC and SequenceFile. It also exposes a REST API that lets external systems access the metadata.
In Apache Hive, as in SQL, you can order or sort your data in different ways depending on your ordering and distribution requirements. In this post we will look at how SORT BY, ORDER BY, DISTRIBUTE BY and CLUSTER BY behave differently in Hive.
Hive uses the columns in SORT BY to sort the rows before feeding them to a reducer. The sort order depends on the column type: a numeric column is sorted in numeric order, and a string column is sorted in lexicographical order.
Ordering: Rows are sorted within each of the N reducers, but different reducers can hold overlapping ranges of data, so there is no total order across reducers.
Outcome: N or more sorted files with overlapping ranges.
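The per-reducer behavior of SORT BY can be imitated outside Hive. The following is a minimal shell sketch, not Hive itself: the function name `simulate_sort_by`, the six sample values and the round-robin split across two "reducer" files are all illustrative assumptions, and Hive's real assignment of rows to reducers works differently.

```shell
#!/bin/sh
# Sketch only: imitates SORT BY with N=2 "reducers" using plain files.
# The sample values and the round-robin split are assumptions made for
# illustration; Hive decides for itself which reducer receives each row.
simulate_sort_by() {
  workdir=$(mktemp -d)

  # Deal the rows alternately to reducer_0 and reducer_1.
  printf '%s\n' 10 3 25 7 1 18 |
    awk -v dir="$workdir" '{ print > (dir "/reducer_" (NR % 2)) }'

  # Each reducer sorts only its own rows (numeric order for a numeric column).
  for f in "$workdir"/reducer_*; do
    sort -n "$f" -o "$f"
    echo "$(basename "$f"): $(paste -sd ' ' "$f")"
  done

  rm -r "$workdir"
}

simulate_sort_by
```

With these sample values, reducer_0 ends up holding 3 7 18 and reducer_1 holds 1 10 25. Each file is sorted on its own, yet the two ranges overlap, which is the outcome described above.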
HiveServer2 (HS2) is a server interface that enables remote clients to execute queries against Hive and retrieve the results. The current implementation, based on Thrift RPC, is an improved version of HiveServer and supports multi-client concurrency and authentication. It is designed to provide better support for open API clients like JDBC and ODBC.
By default, Hive comes with Derby as its metastore storage, which is suited only for testing purposes. In most production scenarios it is recommended to use MySQL as the metastore. This is a step-by-step guide on how to configure a MySQL metastore for Hive in place of the default Derby metastore.
Assumptions: Basic knowledge of Unix is assumed. It is also assumed that Hadoop and Hive are already configured, and that Hive has been set up and tested with the default Derby metastore.
Install MySQL:
$ sudo apt-get install mysql-server
Note: You will be prompted to set a password for root.