HBase Installation in Fully Distributed Mode


This post continues the previous post on HBase installation. The previous post covered HBase installation in pseudo-distributed mode; this post explains how to install and configure HBase in fully distributed mode.

Prerequisites:
  • JDK 1.6 or later installed on the name node and on every data node machine.
  • Hadoop 1 or Hadoop 2 installed and configured in fully distributed mode.
HBase Installation Procedure:
  • Download the latest stable version of HBase from the Apache download mirrors.
  • Unpack the downloaded tar.gz file with the tar xzf command, on the name node and on all data nodes, into the preferred HBase installation directory.
  • On each machine (name node and all data nodes), add the environment variable HBASE_HOME (the HBase installation directory) to the .bashrc file and append the HBase bin directory to the PATH environment variable.
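
The .bashrc change described in the last step might look like the minimal sketch below. The install path /usr/local/hbase is an assumption made for illustration; substitute the directory where the tarball was actually unpacked.

```shell
# Lines to append to ~/.bashrc on the name node and on every data node.
# /usr/local/hbase is a hypothetical install path, not one given in this post.
export HBASE_HOME=/usr/local/hbase
# Make the hbase launcher and the start/stop scripts available on PATH.
export PATH="$PATH:$HBASE_HOME/bin"
```

After editing .bashrc, open a new terminal or run source ~/.bashrc so the variables take effect.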

Verify HBase Installation:

Verify the HBase installation on each machine in the cluster by running the following command from a terminal:

hbase version

A successful installation prints the installed HBase version details.

[Screenshot: output of hbase version]

Configure hbase-env.sh:

On each machine of the HBase cluster, make the following changes to the hbase-env.sh file in the HBASE_HOME/conf directory.

  • Set the JAVA_HOME environment variable to the Java installation directory.
  • Set HBASE_MANAGES_ZK=true in hbase-env.sh if HBase should manage its own ZooKeeper instance.
  • Instead of using the /tmp directory for HBASE_PID_DIR, create a /var/hbase/pids directory on each machine, give the hbase user account full permissions on it, and assign it to HBASE_PID_DIR:

export HBASE_PID_DIR=/var/hbase/pids

These three settings are needed on every machine in the cluster.
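
Taken together, the three hbase-env.sh settings could look like the sketch below. The JDK path is a hypothetical example, and the group name in the commented setup commands is an assumption; only the hbase user and the /var/hbase/pids directory come from the text above.

```shell
# Settings for $HBASE_HOME/conf/hbase-env.sh on every node.

# Hypothetical JDK location; point this at the local Java installation.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# Let HBase start and stop its own ZooKeeper instance.
export HBASE_MANAGES_ZK=true

# Keep PID files out of /tmp.
export HBASE_PID_DIR=/var/hbase/pids

# One-time setup, run with root privileges on each node
# (the group name 'hbase' is an assumption):
#   mkdir -p /var/hbase/pids
#   chown -R hbase:hbase /var/hbase/pids
```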

Configure hbase-site.xml:
In Master Node:

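On the Master node, hbase-site.xml must set the minimum properties required for the HBase cluster configuration, including hbase.cluster.distributed and hbase.zookeeper.quorum.

When listing host names in the hbase.zookeeper.quorum property, use an odd number of machines, such as 1, 3 or 5. ZooKeeper needs a majority of its servers to be available, so an even-sized ensemble tolerates no more failures than the next smaller odd-sized one.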
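
As an illustration, a Master-node hbase-site.xml could be written as in the sketch below. This is a plausible minimum rather than the exact property list from the original post. The host names (master, slave1, slave2), the HDFS port 9000 and the ZooKeeper data directory are all assumptions, and hbase.rootdir must match the HDFS address the Hadoop cluster actually uses.

```shell
# Falls back to a scratch directory when HBASE_HOME is unset,
# so the sketch can be tried without touching a real installation.
# Note: this overwrites an existing hbase-site.xml; back it up first.
CONF_DIR="${HBASE_HOME:-/tmp/hbase-sketch}/conf"
mkdir -p "$CONF_DIR"

# Host names, port and paths below are hypothetical placeholders.
cat > "$CONF_DIR/hbase-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/var/hbase/zookeeper</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
EOF
```

The quorum here lists three hosts, in line with the odd-number recommendation above.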

In Slave Nodes:

Note: The property hbase.cluster.distributed must be set to true on both the master and the slave nodes; otherwise the HBase daemons will not start on the slave nodes, and HBase will run in pseudo-distributed mode on the master node.
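
A slave-node hbase-site.xml could be sketched as follows. The property list from the original post is not available, so this selection is an assumption: it sets the shared root directory, the distributed flag from the note above, and the ZooKeeper quorum so that region servers can locate the cluster. Host names and the HDFS port are hypothetical.

```shell
# Scratch fallback when HBASE_HOME is unset; overwrites an existing file otherwise.
SLAVE_CONF_DIR="${HBASE_HOME:-/tmp/hbase-sketch-slave}/conf"
mkdir -p "$SLAVE_CONF_DIR"

# master, slave1, slave2 and port 9000 are hypothetical placeholders.
cat > "$SLAVE_CONF_DIR/hbase-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
  </property>
</configuration>
EOF
```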

Configure regionservers:

This configuration applies only to the Master node; no change is required on the slave nodes.

On the Master node, the HBASE_HOME/conf/regionservers file needs to be updated with the host names of all the slave nodes on which region server daemons should run.
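
The regionservers file is plain text with one host name per line. A minimal sketch, assuming two slave nodes with the hypothetical host names slave1 and slave2:

```shell
# Scratch fallback when HBASE_HOME is unset; overwrites an existing file otherwise.
RS_CONF_DIR="${HBASE_HOME:-/tmp/hbase-sketch}/conf"
mkdir -p "$RS_CONF_DIR"

# One region server host per line; slave1 and slave2 are placeholders.
printf '%s\n' slave1 slave2 > "$RS_CONF_DIR/regionservers"

# Show the result for a quick visual check.
cat "$RS_CONF_DIR/regionservers"
```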

Start/Stop HBase:

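The configuration is now complete and the HBase daemons are ready to be started. They can be started with the start-hbase.sh command from the Master node.

This automatically starts the HMaster and HQuorumPeer daemons on the Master node and the HQuorumPeer and HRegionServer daemons on the slave nodes. An HRegionServer also runs on the Master node if its host name is listed in the regionservers file, and HQuorumPeer runs only on hosts listed in hbase.zookeeper.quorum.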

[Screenshot: start-hbase.sh output and daemons on the Master node]

[Screenshot: daemons running on a slave node, listed with jps]
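
To check which HBase daemons are running on a node after start-hbase.sh has been run, the output of jps (shipped with the JDK) can be filtered for the daemon names. The helper name filter_hbase_daemons below is made up for this sketch.

```shell
# Keep only the jps lines that belong to HBase daemons.
filter_hbase_daemons() {
  grep -E 'HMaster|HRegionServer|HQuorumPeer'
}

# Run the check only when jps is available on this machine.
if command -v jps >/dev/null 2>&1; then
  jps | filter_hbase_daemons || echo "no HBase daemons running on this node"
fi
```

Running it on the Master node and on each slave node gives a quick view of which daemons came up where.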

A single command, stop-hbase.sh, run on the Master node stops all these daemons on the Master node and on the slave nodes.



About Siva

Senior Hadoop developer with 4 years of experience in designing and architecting solutions for the Big Data domain, involved in several complex engagements. Technical strengths include Hadoop, YARN, MapReduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.

