HBase Functions Cheat Sheet

SHELL
[cloudera@quickstart ~]$ hbase shell

LIST

hbase(main):003:0> list
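
list also accepts a regular expression, which is handy once many tables exist:
hbase(main):004:0> list 'sales.*'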

SCAN

Scanner specifications may include one or more of: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH, COLUMNS, or CACHE. If no columns are specified, all columns will be scanned. To scan all members of a column family, leave the qualifier empty, as in 'col_family:'.
hbase(main):012:0> scan 'myFirstTable'
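
Several of these specifications can be combined in a single scan. A minimal sketch (the TIMERANGE values are illustrative epoch-millisecond timestamps):
hbase(main):013:0> scan 'myFirstTable', {COLUMNS => ['myColumnFamily:'], LIMIT => 5, TIMERANGE => [1303668804000, 1303668904000]}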

SCAN WITH FILTER
hbase(main):079:0> scan 'sales_fact', { FILTER => "KeyOnlyFilter()"} -> returns only keys, with empty values
hbase(main):080:0> scan 'sales_fact', { FILTER => "FirstKeyOnlyFilter()"} -> returns only the first key-value of each row
hbase(main):085:0> scan 'sales_fact', { COLUMNS => ['cf:ek'], FILTER => "PrefixFilter('200611')", LIMIT => 10} -> row-key prefix filter
SHOW FILTERS
hbase(main):004:0> show_filters
ColumnPrefixFilter -> ColumnPrefixFilter(<column_name_prefix>)
TimestampsFilter
PageFilter -> "PageFilter(1)" // number of pages
MultipleColumnPrefixFilter -> "MultipleColumnPrefixFilter('col1','col2','col3')"
FamilyFilter -> FamilyFilter(<compareOp>, <family_comparator>)
ColumnPaginationFilter
SingleColumnValueFilter -> "SingleColumnValueFilter('cf1','col1', =, 'binary:india')"
RowFilter
QualifierFilter -> QualifierFilter(<compareOp>, <qualifier_comparator>)
ColumnRangeFilter
ValueFilter -> "ValueFilter(=, 'binaryprefix:india')"
PrefixFilter -> PrefixFilter(<row_prefix>) -> FILTER => "PrefixFilter('row1')"
SingleColumnValueExcludeFilter
ColumnCountGetFilter
InclusiveStopFilter -> "InclusiveStopFilter('stoprowid_is_included')"
DependentColumnFilter
FirstKeyOnlyFilter -> no argument -> FirstKeyOnlyFilter()
KeyOnlyFilter -> no argument -> KeyOnlyFilter()

QualifierFilter -> { FILTER => "QualifierFilter(=, 'binary:columnname1')"}
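
Filters can also be combined with AND/OR inside a single filter string. A sketch against the sales_fact table used above:
hbase(main):086:0> scan 'sales_fact', {FILTER => "PrefixFilter('2006') AND KeyOnlyFilter()", LIMIT => 10}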

PUT
hbase(main):004:0> put 'myFirstTable', 'row1', 'myColumnFamily:columnA', 'value1'
hbase(main):013:0> put 'myFirstTable', 'row2', 'myColumnFamily:columnB', 'value2'
hbase(main):015:0> put 'myFirstTable', 'row2', 'myColumnFamily:columnC', 'value3'
hbase(main):017:0> put 'myFirstTable', 'row3', 'myColumnFamily:columnC', 'value3'
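
A put may also carry an explicit cell timestamp as a final argument (the value below is an illustrative epoch-millisecond timestamp):
hbase(main):018:0> put 'myFirstTable', 'row1', 'myColumnFamily:columnA', 'value1', 1303668804000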
GET
hbase(main):023:0> get 'myFirstTable', 'row1'
hbase(main):024:0> get 'myFirstTable', 'row2'

hbase(main):002:0> get 'sales_fact', '20060419'
hbase(main):005:0> get 'sales_fact', '20060419', {COLUMN => ['cf:ek','cf:q','cf:up']}
hbase(main):006:0> scan 'sales_fact', {COLUMNS => ['cf:ek','cf:q'], LIMIT => 10}
hbase(main):008:0> scan 'sales_fact', {COLUMNS => ['cf:ek'], LIMIT => 10, STARTROW => '20040113'}
hbase(main):009:0> scan 'sales_fact', {COLUMNS => ['cf:ek'], LIMIT => 10, STARTROW => '20040113', STOPROW => '20040115'}
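
When a column family keeps multiple versions, get can return several of them (the VERSIONS value below is illustrative; note the sales_fact family above was created with VERSIONS => 1):
hbase(main):010:0> get 'myFirstTable', 'row1', {COLUMN => 'myColumnFamily:columnA', VERSIONS => 3}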

DISABLE/DROP
hbase(main):037:0> disable 'myFirstTable'
hbase(main):003:0> drop 'myFirstTable'
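
To empty a table while keeping its schema, truncate performs the same disable/drop and then recreates the table:
hbase(main):004:0> truncate 'myFirstTable'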

CREATE
hbase(main):002:0> create 'myFirstTable', 'myColumnFamily'

DESCRIBE
hbase(main):030:0> describe 'sales_fact'
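
The schema that describe prints can be changed with alter (on older HBase versions the table may need to be disabled first; the VERSIONS value is illustrative):
hbase(main):031:0> alter 'sales_fact', {NAME => 'cf', VERSIONS => 3}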

BULK IMPORT
hbase(main):001:0> create 'sales_fact', {NAME => 'cf', VERSIONS => 1}
hbase(main):003:0> describe 'sales_fact'

[cloudera@localhost ~]$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,cf:ok,cf:ek,cf:rk,cf:rsk,cf:pdk,cf:pmk,cf:omk,cf:sok,cf:sdk,cf:cdk,cf:q,cf:uc,cf:up,cf:usp,cf:gm,cf:st,cf:gp -Dimporttsv.skip.bad.lines=false sales_fact hdfs://localhost:8020/user/cloudera/input/SLS_SALES_FACT.txt
NOTE: The ImportTsv command above runs a MapReduce job to load the data.
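
ImportTsv can also write HFiles instead of issuing puts, and the HFiles are then handed to the region servers in bulk. A sketch, assuming the same table and an HDFS output path of your choosing:
[cloudera@localhost ~]$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,cf:ek -Dimporttsv.bulk.output=hdfs://localhost:8020/user/cloudera/hfiles sales_fact hdfs://localhost:8020/user/cloudera/input/SLS_SALES_FACT.txt
[cloudera@localhost ~]$ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://localhost:8020/user/cloudera/hfiles sales_fact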

COUNT
hbase(main):003:0> count 'sales_fact'
hbase(main):013:0> count 'sales_fact', INTERVAL => 100
hbase(main):014:0> count 'sales_fact', INTERVAL => 1000
hbase(main):015:0> count 'sales_fact', INTERVAL => 10000
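
count scans the entire table, so raising the scanner cache speeds it up on large tables (the CACHE value is illustrative):
hbase(main):016:0> count 'sales_fact', INTERVAL => 10000, CACHE => 1000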

OTHER SHELL COMMANDS
hbase(main):024:0> whoami
hbase(main):021:0> version
hbase(main):022:0> status
hbase(main):023:0> table_help

IS_DISABLED
hbase(main):028:0> is_disabled 'sales_fact'
false
0 row(s) in 0.0510 seconds

IS_ENABLED
hbase(main):029:0> is_enabled 'sales_fact'
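
A disabled table is brought back online with enable:
hbase(main):030:0> enable 'sales_fact'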

EXISTS
hbase(main):031:0> exists 'sales_fact'

MISC

[cloudera@quickstart ~]$ hbase
Usage: hbase [<options>] <command> [<args>]
Options:
  --config DIR    Configuration directory to use. Default: ./conf
  --hosts HOSTS   Override the list in 'regionservers' file

Commands:
Some commands take arguments. Pass no args or -h for usage.

  shell            Run the HBase shell
  hbck             Run the hbase 'fsck' tool
  hlog             Write-ahead-log analyzer
  hfile            Store file analyzer
  zkcli            Run the ZooKeeper shell
  upgrade          Upgrade hbase
  master           Run an HBase HMaster node
  regionserver     Run an HBase HRegionServer node
  zookeeper        Run a Zookeeper server
  rest             Run an HBase REST server
  thrift           Run the HBase Thrift server
  thrift2          Run the HBase Thrift2 server
  clean            Run the HBase clean up script
  classpath        Dump hbase CLASSPATH
  mapredcp         Dump CLASSPATH entries required by mapreduce
  pe               Run PerformanceEvaluation
  ltt              Run LoadTestTool
  version          Print the version
  CLASSNAME        Run the class named CLASSNAME

