HBase


HBase & Solr Search Integration

HBase & Solr: Near Real-Time Indexing and Search

Requirements:
A. HBase table
B. Solr collection on HDFS
C. Lily HBase Indexer
D. Morphline configuration file

Once the Solr server is ready, we can configure our collection (in SolrCloud), which will be linked to the HBase table. Add the required properties to the hbase-site.xml file. Then add the required properties to /etc/hbase-solr/conf/hbase-indexer-site.xml; these enable the Lily indexer to reach the HBase cluster for indexing. […]


HBase Functions Cheat Sheet

SHELL
[cloudera@quickstart ~]$ hbase shell

LIST
hbase(main):003:0> list

SCAN
Scanner specifications may include one or more of: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH, or COLUMNS, CACHE. If no columns are specified, all columns will be scanned. To scan all members of a column family, leave the qualifier empty, as in 'col_family:'.
hbase(main):012:0> scan 'myFirstTable'

SCAN WITH FILTER
hbase(main):079:0> scan 'sales_fact', { FILTER => "KeyOnlyFilter()"} –> […]


HBase Shell Commands in Practice

In our previous posts we covered the HBase Overview and HBase Installation; now it is time to practice some HBase shell commands to get familiar with HBase. We will test a few HBase shell commands in this post. HBase Shell Usage: Quote all names in the HBase shell, such as table and column names. Commas delimit command parameters. Type <RETURN> after entering a command to run it. Dictionaries of configuration used […]


Flume Data Collection into HBase

We will discuss collecting data into HBase directly through a Flume agent. In our previous posts under the Flume category, we covered the setup of Flume agents for the file roll, logger and HDFS sink types. In this post, we explore the details of the HBase sink and its setup with a live example. As we have already covered the File channel, Memory channel and JDBC channel, we will try to make […]


HBase Daemons in Pseudo Distribution Mode

In an HBase cluster, we can start the HBase daemons with the start-hbase.sh command or with individual hbase-daemon.sh commands.

But in pseudo distribution mode (hbase.cluster.distributed=false), only the HMaster daemon is started, not the HRegionServer or HQuorumPeer daemons. Neither start-hbase.sh nor an individual hbase-daemon.sh command for the region server will start that daemon, because of a condition in the start-hbase.sh script.
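
The branching described here can be sketched roughly as follows. This is a simplified illustration under assumptions, not the text of start-hbase.sh: the function name daemons_for_mode is hypothetical, the value of hbase.cluster.distributed is passed in by hand rather than read from the configuration, and the real script performs additional steps that are left out.

```shell
#!/usr/bin/env bash
# Simplified sketch of the mode check; NOT the actual start-hbase.sh.
# daemons_for_mode is a hypothetical helper used only for illustration.
daemons_for_mode() {
  local dist_mode="$1"   # value of hbase.cluster.distributed
  if [ "$dist_mode" = "false" ]; then
    # Non-distributed: only the master daemon is launched.
    echo "master"
  else
    # Distributed: ZooKeeper, master and region server daemons are launched.
    echo "zookeeper master regionserver"
  fi
}

daemons_for_mode false
daemons_for_mode true
```

Calling the function with false lists only the master, which matches the behaviour described above; with true it lists all three daemon types.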

When we try to start the regionserver daemon through the hbase-daemon.sh command, we […]


HBase Installation in Pseudo Distribution Mode

This post describes the procedure for installing HBase on an Ubuntu machine in pseudo-distributed mode using an HDFS configuration. Prerequisites: Java is one of the main prerequisites; JDK 1.6 or later is required to run HBase. Hadoop 1 or Hadoop 2 must be installed on a pseudo-distributed or fully distributed cluster. HBase Installation Procedure: Follow the steps below, in order, to complete the HBase installation on an Ubuntu machine. […]


HBase Overview

HBase is Hadoop's database, and below is a high-level overview of it. HBase Overview: What is HBase? HBase is a scalable, distributed, column-oriented database built on top of Hadoop and HDFS. Apache HBase is an open-source, non-relational database modeled on Google's "Bigtable: A Distributed Storage System for Structured Data". HBase provides random, real-time read/write access to Big Data. Need for HBase: Although most […]