HBase Installation in Pseudo Distributed Mode


This post describes the procedure for installing HBase on an Ubuntu machine in pseudo-distributed mode, with HDFS as the underlying storage.

Prerequisites: 
  • Java is the main prerequisite. JDK 1.6 or a later version is required to run HBase.
  • Hadoop 1 or Hadoop 2 installed as a pseudo-distributed or fully distributed cluster.
HBase Installation Procedure:

Follow the steps below, in order, to complete the HBase installation on an Ubuntu machine.

  • Download the latest stable version of HBase from the Apache Download Mirrors. Click the suggested mirror site link and choose the latest stable release. Select the .tar.gz (binary) file for download.
  • In this post, we downloaded the hbase-0.98.2-hadoop2-bin.tar.gz file from the above site link.
    At the time of writing, HBase is available in two flavors, one for Hadoop 1 and one for Hadoop 2. If your Hadoop release is hadoop-2.0.2 or later, download the hadoop2 file; if it is earlier than hadoop-2.0.2 (like hadoop-1.2.1), download the hadoop1 file.
  • Copy the downloaded .tar.gz file to the preferred installation directory, usually /usr/lib/hbase, and unpack it with the tar xzf command.

 Below is a list of useful terminal commands on an Ubuntu machine to copy the downloaded file, unpack it, and verify the unpacked directory.

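Assuming the archive was saved to ~/Downloads, the commands might look like this (the paths and version are this post's examples; adjust them to your own download and installation directory):

```shell
sudo mkdir -p /usr/lib/hbase
sudo cp ~/Downloads/hbase-0.98.2-hadoop2-bin.tar.gz /usr/lib/hbase/
cd /usr/lib/hbase
sudo tar xzf hbase-0.98.2-hadoop2-bin.tar.gz
ls hbase-0.98.2-hadoop2        # verify the unpacked directory
```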

Setting up Environment Variables:

After unpacking the tar.gz file into our chosen installation directory, we need to set the HBASE_HOME environment variable to the HBase installation home directory and add HBase's bin directory to the PATH environment variable.

Add the below lines to the .bashrc file on Ubuntu.
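The path below assumes the unpacked directory from this post's example; substitute your own installation directory:

```shell
# Append to ~/.bashrc; adjust the path to your HBase installation directory
export HBASE_HOME=/usr/lib/hbase/hbase-0.98.2-hadoop2
export PATH=$PATH:$HBASE_HOME/bin
```

Run source ~/.bashrc (or open a new terminal) so the variables take effect.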

 

Configuring HBase Installation:

At this point we have successfully downloaded and unpacked HBase into the installation directory. Before starting HBase, we need to tell it where the Java installation home directory is.

  • We need to update the JAVA_HOME environment variable in the conf/hbase-env.sh file. At a minimum, the below four configuration properties need to be set up in the hbase-env.sh file.

If we are not going to install and configure a separate ZooKeeper coordination service for HBase now, and have decided to use HBase's default built-in ZooKeeper instance instead, we must set the HBASE_MANAGES_ZK property to true.
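A sketch of the corresponding conf/hbase-env.sh entries; the Java path and the exact property set here are assumptions, so use the Java home from your own machine:

```shell
# conf/hbase-env.sh (sketch; adjust paths to your machine)
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64    # your Java home may differ
export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
export HBASE_LOG_DIR=${HBASE_HOME}/logs
export HBASE_MANAGES_ZK=true    # let HBase manage its own built-in ZooKeeper
```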

  • In the conf/hbase-site.xml file we need to set the below site-specific configuration properties at a minimum: hbase.rootdir, the directory HBase writes data to, and hbase.zookeeper.property.dataDir, the directory ZooKeeper writes its data to.
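A minimal conf/hbase-site.xml along these lines could work; hdfs://localhost:9000 assumes the NameNode address from a typical pseudo-distributed Hadoop setup, so match it to the value in your own core-site.xml:

```xml
<configuration>
  <!-- Root directory for HBase data, on HDFS in this setup -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <!-- Run in (pseudo-)distributed mode rather than standalone -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- Directory where the managed ZooKeeper instance stores its data -->
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/lib/hbase/zookeeper</value>
  </property>
</configuration>
```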

  • In the conf/regionservers file, list the names of all the region servers that act as slaves to the HBase Master server. In pseudo-distributed mode, add a single entry, localhost, to the regionservers file.
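The file content can be written (or checked) from the terminal as below; the fallback directory is only a placeholder for illustration, since in a real installation the conf/ directory already exists under HBASE_HOME:

```shell
# Point HBASE_HOME at the unpacked HBase directory; the fallback path here
# is only a placeholder for illustration
HBASE_HOME=${HBASE_HOME:-/tmp/hbase-demo}
mkdir -p "$HBASE_HOME/conf"
echo "localhost" > "$HBASE_HOME/conf/regionservers"
cat "$HBASE_HOME/conf/regionservers"   # prints: localhost
```

Note that the stock conf/regionservers file shipped with HBase usually already contains localhost.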
Note:

By default, hbase.rootdir is set to /tmp/hbase-${user.name}, and the default ZooKeeper data directory is under /tmp as well. Most operating systems clear /tmp on restart, so unless we change these locations we will lose all our data whenever the server reboots.

Verify HBase Installation:

HBase installation can be verified from the terminal with the below command:

hbase version

A successful installation will print the HBase version along with its build details.

If you see a similar message, congratulations! You have successfully installed HBase on HDFS in pseudo-distributed mode.


About Siva

Senior Hadoop developer with 4 years of experience in designing and architecting solutions for the Big Data domain, and involved in several complex engagements. Technical strengths include Hadoop, YARN, MapReduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.

