
HBase Functions Cheat Sheet

[cloudera@quickstart ~]$ hbase shell


hbase(main):003:0> list


Scanner specifications may include one or more of: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH, COLUMNS, or CACHE. If no columns are specified, all columns will be scanned. To scan all members of a column family, leave the qualifier empty, as in 'col_family:'.
hbase(main):012:0> scan 'myFirstTable'
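Row keys in HBase are sorted lexicographically as bytes, STARTROW is inclusive, and STOPROW is exclusive. A minimal Python sketch of those scan semantics (a plain sorted dict standing in for a table, not the HBase client API):

```python
# Toy model of an HBase scan: rows sorted by row key,
# STARTROW inclusive, STOPROW exclusive, COLUMNS restricts output.
def scan(table, columns=None, startrow=None, stoprow=None, limit=None):
    out = []
    for rowkey in sorted(table):
        if startrow is not None and rowkey < startrow:
            continue
        if stoprow is not None and rowkey >= stoprow:
            break
        cells = {c: v for c, v in table[rowkey].items()
                 if columns is None or c in columns}
        if cells:
            out.append((rowkey, cells))
        if limit is not None and len(out) >= limit:
            break
    return out

sales_fact = {
    "20040113": {"cf:ek": "20040113", "cf:q": "5"},
    "20040114": {"cf:ek": "20040114", "cf:q": "7"},
    "20040115": {"cf:ek": "20040115", "cf:q": "2"},
}
# Mirrors: scan 'sales_fact', {COLUMNS => ['cf:ek'],
#          STARTROW => '20040113', STOPROW => '20040115'}
rows = scan(sales_fact, columns=["cf:ek"],
            startrow="20040113", stoprow="20040115")
```

Because STOPROW is exclusive, row '20040115' is not returned.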

hbase(main):079:0> scan 'sales_fact', { FILTER => "KeyOnlyFilter()" } --> returns keys only (no values)
hbase(main):080:0> scan 'sales_fact', { FILTER => "FirstKeyOnlyFilter()" } --> returns only the first key of each row
hbase(main):085:0> scan 'sales_fact', { COLUMNS => ['cf:ek'], FILTER => "PrefixFilter('200611')", LIMIT => 10 } --> row ID prefix filter
hbase(main):004:0> show_filters
ColumnPrefixFilter --> ColumnPrefixFilter(<column_name_prefix>)
PageFilter --> "PageFilter(1)" // number of pages
MultipleColumnPrefixFilter --> "MultipleColumnPrefixFilter('col1','col2','col3')"
FamilyFilter --> FamilyFilter('column_family_name')
SingleColumnValueFilter --> "SingleColumnValueFilter('cf1','col1',=, 'binary:india')"
QualifierFilter --> QualifierFilter(<compareOp>, <qualifier_comparator>)
ValueFilter --> "ValueFilter(=, 'binaryprefix:india')"
PrefixFilter --> PrefixFilter(<row_prefix>) --> FILTER => "PrefixFilter('row1')"
InclusiveStopFilter --> "InclusiveStopFilter('stoprowid_is_included')"
FirstKeyOnlyFilter --> no argument --> FirstKeyOnlyFilter()
KeyOnlyFilter --> no argument --> KeyOnlyFilter()

QualifierFilter --> { FILTER => "QualifierFilter(=,'binary:columnname1')" }
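It is easy to mix up filters that match on the row key with filters that match on the column qualifier. A rough Python analogy (plain dicts, not the HBase filter classes) of PrefixFilter versus ColumnPrefixFilter:

```python
# PrefixFilter matches on the ROW KEY prefix; ColumnPrefixFilter matches on
# the column QUALIFIER prefix (the part after 'cf:').
table = {
    "200611_a": {"cf:ek": "1", "cf:ekx": "2", "cf:q": "3"},
    "200612_b": {"cf:ek": "4"},
}

def prefix_filter(table, row_prefix):
    # Keep whole rows whose row key starts with the prefix.
    return {r: cells for r, cells in table.items()
            if r.startswith(row_prefix)}

def column_prefix_filter(table, qual_prefix):
    # Keep only cells whose qualifier starts with the prefix.
    out = {}
    for r, cells in table.items():
        kept = {c: v for c, v in cells.items()
                if c.split(":", 1)[1].startswith(qual_prefix)}
        if kept:
            out[r] = kept
    return out
```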

hbase(main):004:0> put 'myFirstTable', 'row1', 'myColumnFamily:columnA', 'value1'
hbase(main):013:0> put 'myFirstTable', 'row2', 'myColumnFamily:columnB', 'value2'
hbase(main):015:0> put 'myFirstTable', 'row2', 'myColumnFamily:columnC', 'value3'
hbase(main):017:0> put 'myFirstTable', 'row3', 'myColumnFamily:columnC', 'value3'
hbase(main):002:0> scan 'sales_fact', {COLUMNS => ['cf:ek'], LIMIT => 10, STARTROW => '20040113', STOPROW => '20040115'}
hbase(main):023:0> get 'myFirstTable', 'row1'
hbase(main):024:0> get 'myFirstTable', 'row2'
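The put/get commands above follow HBase's map-of-maps data model: table → row key → 'family:qualifier' → value. A minimal Python sketch of that model (illustration only, not the HBase client):

```python
# Table as nested dicts: {rowkey: {"family:qualifier": value}}
class ToyTable:
    def __init__(self):
        self.rows = {}

    def put(self, rowkey, column, value):
        # put overwrites the cell at (rowkey, column)
        self.rows.setdefault(rowkey, {})[column] = value

    def get(self, rowkey):
        # get returns all cells of one row
        return self.rows.get(rowkey, {})

t = ToyTable()
t.put("row1", "myColumnFamily:columnA", "value1")
t.put("row2", "myColumnFamily:columnB", "value2")
t.put("row2", "myColumnFamily:columnC", "value3")
```

Note that rows are sparse: 'row1' has only columnA, while 'row2' has columnB and columnC.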

hbase(main):002:0> get 'sales_fact', '20060419'
hbase(main):005:0> get 'sales_fact', '20060419', {COLUMN => ['cf:ek','cf:q','cf:up']}
hbase(main):006:0> scan 'sales_fact', {COLUMNS => ['cf:ek','cf:q'], LIMIT => 10}
hbase(main):008:0> scan 'sales_fact', {COLUMNS => ['cf:ek'], LIMIT => 10, STARTROW => '20040113'}

hbase(main):037:0> disable 'myFirstTable'
hbase(main):003:0> drop 'myFirstTable'

hbase(main):002:0> create 'myFirstTable', 'myColumnFamily'

hbase(main):030:0> describe 'sales_fact'

hbase(main):001:0> create 'sales_fact', {NAME => 'cf', VERSIONS => 1}
hbase(main):003:0> describe 'sales_fact'
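VERSIONS => 1 means each cell keeps only its newest value; with VERSIONS => n, the n most recent timestamped values are retained and older ones become eligible for removal. A sketch of that retention rule in Python (a toy model, not the HBase internals):

```python
# Each cell holds (timestamp, value) pairs, newest first,
# trimmed to the column family's VERSIONS setting.
def put_versioned(cell, timestamp, value, versions=1):
    cell.append((timestamp, value))
    cell.sort(key=lambda tv: tv[0], reverse=True)  # newest first
    del cell[versions:]                            # drop excess versions
    return cell

cell = []
put_versioned(cell, 100, "old", versions=1)
put_versioned(cell, 200, "new", versions=1)
# With VERSIONS => 1, only the newest value survives.
```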

[cloudera@localhost ~]$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,cf:ok,cf:ek,cf:rk,cf:rsk,cf:pdk,cf:pmk,cf:omk,cf:sok,cf:sdk,cf:cdk,cf:q,cf:uc,cf:up,cf:usp,cf:gm,cf:st,cf:gp -Dimporttsv.skip.bad.lines=false 'sales_fact' hdfs://localhost:8020/user/cloudera/input/SLS_SALES_FACT.txt
NOTE: The ImportTsv command above runs a MapReduce job to load the data.
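ImportTsv maps each tab-separated field to the column named at the same position in -Dimporttsv.columns, with HBASE_ROW_KEY marking which field becomes the row key. A rough Python sketch of that per-line mapping (illustration only, not the actual mapper; the sample values are made up):

```python
# columns spec: one entry per TSV field; HBASE_ROW_KEY marks the row key.
def parse_tsv_line(line, columns):
    fields = line.rstrip("\n").split("\t")
    rowkey, cells = None, {}
    for spec, value in zip(columns, fields):
        if spec == "HBASE_ROW_KEY":
            rowkey = value
        else:
            cells[spec] = value
    return rowkey, cells

columns = ["HBASE_ROW_KEY", "cf:ok", "cf:ek"]
rowkey, cells = parse_tsv_line("20040113\t11\t20040113", columns)
```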

hbase(main):003:0> count 'sales_fact'
hbase(main):013:0> count 'sales_fact', INTERVAL => 100
hbase(main):014:0> count 'sales_fact', INTERVAL => 1000
hbase(main):015:0> count 'sales_fact', INTERVAL => 10000
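In the shell, count's INTERVAL only sets how often a progress line ("Current count: N, row: ...") is printed while rows are streamed; it does not change the final total. A small Python sketch of that progress behavior (illustration only):

```python
# Count row keys, recording a progress report every `interval` rows,
# like the shell's "Current count: n, row: key" lines.
def count_rows(rowkeys, interval=1000):
    n = 0
    progress = []
    for key in rowkeys:
        n += 1
        if n % interval == 0:
            progress.append((n, key))
    return n, progress

total, progress = count_rows([f"row{i:03d}" for i in range(1, 251)],
                             interval=100)
```

A larger INTERVAL means fewer progress lines over the same number of rows.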
hbase(main):024:0> whoami
hbase(main):021:0> version
hbase(main):022:0> status
hbase(main):023:0> table_help
hbase(main):028:0> is_disabled 'sales_fact'
0 row(s) in 0.0510 seconds

hbase(main):029:0> is_enabled 'sales_fact'

hbase(main):031:0> exists 'sales_fact'


[cloudera@quickstart ~]$ hbase
Usage: hbase [<options>] <command> [<args>]
--config DIR Configuration directory to use. Default: ./conf
--hosts HOSTS Override the list in 'regionservers' file

Some commands take arguments. Pass no args or -h for usage.

shell Run the HBase shell
hbck Run the hbase ‘fsck’ tool
hlog Write-ahead-log analyzer
hfile Store file analyzer
zkcli Run the ZooKeeper shell
upgrade Upgrade hbase
master Run an HBase HMaster node
regionserver Run an HBase HRegionServer node
zookeeper Run a Zookeeper server
rest Run an HBase REST server
thrift Run the HBase Thrift server
thrift2 Run the HBase Thrift2 server
clean Run the HBase clean up script
classpath Dump hbase CLASSPATH
mapredcp Dump CLASSPATH entries required by mapreduce
pe Run PerformanceEvaluation
ltt Run LoadTestTool
version Print the version
CLASSNAME Run the class named CLASSNAME

About Siva

Senior Hadoop developer with 4 years of experience in designing and architecture solutions for the Big Data domain and has been involved with several complex engagements. Technical strengths include Hadoop, YARN, Mapreduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.
