HiveServer2 (HS2) is a server interface that enables remote clients to execute queries against Hive and retrieve the results. The current implementation, based on Thrift RPC, is an improved version of HiveServer and supports multi-client concurrency and authentication. It is designed to provide better support for open API clients like JDBC and ODBC.

In this post, we will see how to start HiveServer2 and connect to it with a JDBC client.

Part 1 : How to Start HiveServer2 (Hive as a Service)

$HIVE_HOME/bin/hiveserver2

OR

$HIVE_HOME/bin/hive --service hiveserver2

You should see the HiveServer2 startup messages on the console.

A quick way to check whether HiveServer2 is running is to use the netstat command to see whether port 10000 is open and listening for connections, for example with netstat -nl | grep 10000 on Linux.
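If netstat is not available, the same check can be sketched from Java by simply trying to open a TCP connection to the port. This is only a minimal sketch: the class and method names (PortCheck, isListening) are made up for this post, and localhost with a 2 second timeout are assumptions about your setup.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Hive): reports whether a TCP port accepts connections.
class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: HiveServer2 runs on this machine on its default port 10000.
        boolean up = isListening("localhost", 10000, 2000);
        System.out.println(up
                ? "Port 10000 is accepting connections"
                : "Nothing is listening on port 10000");
    }
}
```

Note that an open port only tells us that something is listening there, not that it is HiveServer2, so this is a quick sanity check rather than a health check.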

Now that the Hive service is up, we can connect a JDBC client to the server to create tables, run queries against them, and so on.

 

Part 2 : Using JDBC to Connect to HiveServer2

You can use JDBC to access data stored in a relational database or other tabular format.

  1. Load the HiveServer2 JDBC driver, for example with Class.forName("org.apache.hive.jdbc.HiveDriver").

  2. Connect to the database by creating a Connection object with the JDBC driver, for example with DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "<password>").

  3. The default <port> is 10000. In non-secure configurations, specify a <user> for the query to run as. The <password> field value is ignored in non-secure mode.

    In Kerberos secure mode, the user information is based on the Kerberos credentials.

  4. Submit SQL to the database by creating a Statement object and using its executeQuery() method, for example with stmt.executeQuery("SELECT * FROM <table>").

  5. Process the result set, if necessary.
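Put together, the steps above could look roughly like the sketch below. It assumes the Hive JDBC driver is on the classpath and a non-secure HiveServer2 on localhost:10000. The class name HiveJdbcSteps, the jdbcUrl helper, the "default" database and the "hive" user are illustrative choices for this post, not something Hive prescribes.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Illustrative sketch of the five steps; names and connection details are assumptions.
class HiveJdbcSteps {

    // Builds a HiveServer2 JDBC URL of the form jdbc:hive2://<host>:<port>/<database>.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) throws ClassNotFoundException, SQLException {
        // Step 1: load the HiveServer2 JDBC driver.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Steps 2 and 3: connect on the default port; the password is ignored in non-secure mode.
        String url = jdbcUrl("localhost", 10000, "default");
        try (Connection connection = DriverManager.getConnection(url, "hive", "");
             // Step 4: create a Statement and submit SQL through executeQuery().
             Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery("SHOW TABLES")) {
            // Step 5: process the result set, here by printing the first column of each row.
            while (resultSet.next()) {
                System.out.println(resultSet.getString(1));
            }
        }
    }
}
```

The try-with-resources block closes the result set, statement and connection in reverse order, even when a query fails.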

Let’s understand this with an example.

We’ll create a text file with test values, load it into Hive, and display the data using queries:

# Two rows of (key, value); \x01 (Ctrl-A) is Hive's default field delimiter for text tables
echo -e '1\x01foo' > /tmp/a.txt
echo -e '2\x01bar' >> /tmp/a.txt
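On systems where echo -e is not available, the same file could be produced from Java. This is a small sketch with made-up names (TestDataWriter, rows, write); the only things taken from the commands above are the two rows, the Ctrl-A delimiter and the /tmp/a.txt path.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that writes the same two test rows as the echo commands.
class TestDataWriter {

    // Ctrl-A, the field delimiter used in the echo commands.
    static final char FIELD_DELIMITER = '\u0001';

    // The test rows: (1, foo) and (2, bar).
    static List<String> rows() {
        return Arrays.asList(
                "1" + FIELD_DELIMITER + "foo",
                "2" + FIELD_DELIMITER + "bar");
    }

    // Writes one row per line, replacing the file if it already exists.
    static void write(Path target) throws IOException {
        Files.write(target, rows(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        write(Paths.get("/tmp/a.txt"));
    }
}
```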

 

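Test Java Client

We need to add the Hive JDBC driver and its Hadoop dependencies to the Maven project (typically the org.apache.hive:hive-jdbc artifact plus a Hadoop client artifact, in versions matching your cluster).

And let’s create a test Java program.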
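A test program along these lines could look like the sketch below, modelled on the client example in the Hive wiki [1]. The table name testHiveDriverTable, the "hive" user, the "default" database and the helper methods are assumptions for illustration. Keep in mind that LOAD DATA LOCAL INPATH reads the file from the machine where HiveServer2 runs, so /tmp/a.txt must exist there.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch of a test client; table name and connection details are assumptions.
class HiveJdbcClient {

    private static final String DRIVER = "org.apache.hive.jdbc.HiveDriver";
    static final String TABLE = "testHiveDriverTable";

    // No ROW FORMAT clause, so Hive uses its default Ctrl-A field delimiter.
    static String createTableSql(String table) {
        return "CREATE TABLE " + table + " (key INT, value STRING)";
    }

    static String loadDataSql(String table, String localPath) {
        return "LOAD DATA LOCAL INPATH '" + localPath + "' INTO TABLE " + table;
    }

    // Runs a query and prints the first `columns` columns of each row, tab-separated.
    static void printRows(Statement stmt, String sql, int columns) throws SQLException {
        System.out.println("Running: " + sql);
        try (ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                StringBuilder row = new StringBuilder();
                for (int i = 1; i <= columns; i++) {
                    if (i > 1) {
                        row.append('\t');
                    }
                    row.append(rs.getString(i));
                }
                System.out.println(row);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Class.forName(DRIVER);
        String url = "jdbc:hive2://localhost:10000/default";
        try (Connection con = DriverManager.getConnection(url, "hive", "");
             Statement stmt = con.createStatement()) {
            // Start from a clean table so the program can be re-run.
            stmt.execute("DROP TABLE IF EXISTS " + TABLE);
            stmt.execute(createTableSql(TABLE));

            printRows(stmt, "SHOW TABLES '" + TABLE + "'", 1);
            printRows(stmt, "DESCRIBE " + TABLE, 2);

            // Load the test file created earlier, then read it back.
            stmt.execute(loadDataSql(TABLE, "/tmp/a.txt"));
            printRows(stmt, "SELECT * FROM " + TABLE, 2);
            printRows(stmt, "SELECT COUNT(1) FROM " + TABLE, 1);
        }
    }
}
```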

When we run the test JDBC client, it prints the results of each query to the console.

On the HiveServer2 screen you should see the corresponding output, including the MapReduce jobs launched to process the queries.

That’s it! We now have HiveServer2 running and a JDBC client successfully connected to it.

Happy learning!

References:

[1] https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients