In this post, we discuss the setup needed to integrate HBase with Hive. We then test the integration by creating a test HBase table from the Hive shell, populating it from another Hive table, and finally verifying its contents in HBase.
The reason to use Hive on HBase is that a lot of data sits in HBase because of its use in real-time environments, but this data is rarely used for analysis because few tools connect to HBase directly.
We will use the storage handler mechanism to create HBase tables via Hive. HBaseStorageHandler allows Hive DDL to manage table definitions in both the Hive metastore and HBase’s catalog simultaneously and consistently.
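As a rough illustration of the storage handler mechanism, a Hive DDL statement along the following lines creates the table in the Hive metastore and in HBase in one step. This is only a minimal sketch: the table name hbase_table_sketch, the columns key and value, and the column family cf1 are hypothetical names chosen for illustration and are not taken from this post.

```sql
-- Hypothetical example: table, column, and column family names are assumptions.
-- STORED BY points Hive at the HBase storage handler class.
CREATE TABLE hbase_table_sketch (key INT, value STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
-- Map Hive columns to HBase in order: ":key" is the HBase row key,
-- "cf1:val" is qualifier "val" in column family "cf1".
-- The mapping string must not contain spaces between entries.
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
-- Name of the underlying HBase table (optional; defaults to the Hive table name).
TBLPROPERTIES ("hbase.table.name" = "hbase_table_sketch");
```

Because the table is declared without the EXTERNAL keyword, Hive manages it, so dropping it from Hive is expected to drop the underlying HBase table as well.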
Setup for HBase Integration with Hive:
To set up HBase integration with Hive, we mainly need a few jar files to be present in the $HIVE_HOME/lib or $HBASE_HOME/lib directory. The required jar files are:
The $HBASE_HOME/lib directory contains many hbase-*.jar files; the ones listed below are for the Hadoop 2 API.
We need to add the paths of the above jar files to the value of the hive.aux.jars.path property in the hive-site.xml configuration file.
Verify HBase Integration with Hive:
Let's create a new HBase table via the Hive shell. To test HBase table creation, the Hadoop, YARN, and HBase daemons need to be running.