Zookeeper Commands


This post collects some notes on Zookeeper commands and scripts. It is mainly useful for Hadoop admins, and all commands are self-explanatory.

  • ZooKeeper is a centralized coordination service for distributed applications
  • Zookeeper addresses issues with distributed applications:
    • Maintain configuration information (share config info across all nodes)
    • Naming service (allows one node to find a specific machine in a cluster of thousands of servers)
    • Distributed synchronization (locks, barriers, queues, etc)
    • Group services (leader election, etc)
  • ZooKeeper – Replicated mode
    • One leader and rest are followers
    • At startup, the client is given a list of servers; it connects to a single server from that list
    • Clients can read from any server, but write operations must go through the leader
  • ZooKeeper – Standalone mode
    • Single server
    • All clients connect to this one server
    • Good for testing/learning
  • ZooKeeper provides 5 consistency guarantees:
    • Sequential consistency (updates from a client are applied in the order they were sent)
    • Atomicity (updates succeed OR fail; no partial updates)
    • Single system image (a client sees the same view of the ZooKeeper service regardless of the server it is connected to)
    • Reliability (once an update succeeds, it is persisted; it stays in place until another client update overwrites it)
    • Timeliness (the client's view of the system is guaranteed to be up to date within a time bound, on the order of tens of seconds)
  • ZooKeeper DOES NOT guarantee:
    • Simultaneously consistent cross-client views (different clients may see different versions of Zookeeper data at the same time); an update made by one client may NOT be visible to another client immediately because of network delays
    • You can call sync() before a read to overcome this issue; it forces the Zookeeper server the client is connected to to catch up with the leader
  • Distributed processes coordinate through a shared hierarchical namespace
  • The namespace consists of a root node and 1 or more child nodes; these nodes are called znodes
  • Data access is atomic (Reads/Writes everything or nothing) (no partial reads/writes)
  • Each znode has an ACL (access control list)
  • Znodes are used to store data <1 MB (zookeeper is designed for coordination using small data files and not for high volume storage)
  • 2 types of znodes:
    • Ephemeral (deleted when the client's session ends) (cannot have child nodes)
    • Persistent (stays until explicitly deleted; can be deleted by any client with permission) (can have child nodes)
  • Sequential znodes:
    • ZooKeeper appends a monotonically increasing number to the name you supply
    • Can be used for hierarchical ordering
    • Client can parse the name to see the order of creation
  • Time in Zookeeper:
    • Zxid
    • Version
    • Ticks
    • Session timeouts
  • Znode statuses:
    • czxid, mzxid
    • ctime, mtime
    • version, cversion, aversion
    • ephemeralOwner
    • dataLength
    • numChildren
  • Zookeeper states: a Zookeeper client object (the session handle) can be in ONLY one of the following states at a time:
    • Connecting
    • Connected
    • Closed
  • Use getState() to get the state of the object
  • Zookeeper watches
    • Allows applications to get notified of a change
    • 3 properties of a watch:
      • one-time trigger (notified once per watch)
      • trigger occurs when the data or state changes
      • notification is sent to the client that set the watch
    • Watches are used to:
      • Notify of changes in znodes
      • Notify of changes in states
  • Zookeeper clients
    • Java, C, Python, Perl
    • Apache Curator (originally developed at Netflix)
      • Set of Java libraries that make Zookeeper easier to use
      • Code example: client.create().forPath("/my/path", mydata)
  • Zookeeper recipes
    • Higher-level constructs built on top of the Zookeeper service
    • You can use these in your applications for specific needs
    • Out of the box recipes:
      • Name service
      • Configuration
      • Group membership
    • Additional recipes
      • Barriers
      • Locks
      • Two phase commit
      • Leader election
      • Queues
    • Curator framework implements all these recipes except two phase commit
  • Zookeeper atomic broadcast (Zab)
    • Broadcast protocol to propagate state changes from the leader
  • When a Zookeeper object is created, 2 threads are spawned:
    • IO Thread (for all IO)
    • Event Thread (event callbacks)
  • Zookeeper java API
  • Zookeeper Exceptions
    • InterruptedException
    • KeeperException: 3 categories
      • State exceptions (ex: 2 processes try to update a znode at the same time)
      • Recoverable exceptions (ex: KeeperException.ConnectionLossException)
      • Unrecoverable exceptions (ex: KeeperException.SessionExpiredException)
  • ACL
    • ACL Permissions
      • CREATE
      • READ
      • WRITE
      • DELETE
      • ADMIN
    • Ex: ip:19.23.0.0/16, READ
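
As a small illustration of the ACL permissions listed above: each permission is stored as one bit of an integer mask, and the command-line client abbreviates them with the letters c, d, r, w and a. The sketch below converts between the two forms. The bit values are those of ZooDefs.Perms in the Java client; the function names are made up for this example.

```python
from typing import List

# Permission bits as defined by ZooDefs.Perms in the ZooKeeper Java client.
PERMS = {"READ": 1, "WRITE": 2, "CREATE": 4, "DELETE": 8, "ADMIN": 16}

# One-letter abbreviations used in ACL strings such as "ip:19.23.0.0/16:r".
LETTERS = {"r": "READ", "w": "WRITE", "c": "CREATE", "d": "DELETE", "a": "ADMIN"}


def perms_to_mask(letters: str) -> int:
    """Convert a permission string such as 'cdrwa' into its bitmask."""
    mask = 0
    for ch in letters:
        mask |= PERMS[LETTERS[ch]]
    return mask


def mask_to_names(mask: int) -> List[str]:
    """List the permission names that are set in a bitmask."""
    return [name for name, bit in PERMS.items() if mask & bit]
```

With these helpers, perms_to_mask("r") describes the READ-only example above, and perms_to_mask("cdrwa") is the mask that grants everything.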

Zookeeper Commands

 

LIST THE TOP LEVEL OF THE ZOOKEEPER NAMESPACE

CREATE A NEW ZNODE

GET command TO VIEW the data and metadata

NOTES:

  • ‘first_version’ is the data stored in the znode
  • cZxid is the transaction id of the change that caused this node to be created
  • ctime is the time when this znode was created
  • mZxid is the transaction id of the change that last modified this znode
  • mtime is the time when this znode was modified
  • pZxid is the transaction id of the change that last modified the children of this znode
  • cversion is the number of changes to the children of this znode
  • dataVersion is the number of changes to the data of this znode
  • aclVersion is the number of changes to the ACL of this znode
  • ephemeralOwner is the session id of the owner of this znode if it is an ephemeral node; if it is not an ephemeral node, it will be zero
  • dataLength is the length of the data stored in this znode
  • numChildren is number of children
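
The zxid fields in the notes above (cZxid, mZxid, pZxid) are 64-bit numbers that the command-line client prints in hexadecimal. The high 32 bits hold the leader epoch and the low 32 bits hold a counter within that epoch. Here is a minimal sketch for pulling the two parts apart; the function names and the sample value are illustrative only.

```python
from typing import Tuple


def parse_zxid(text: str) -> int:
    """Parse a zxid printed in hex (for example '0x10000000a') into an int."""
    return int(text, 16)


def split_zxid(zxid: int) -> Tuple[int, int]:
    """Split a 64-bit zxid into (epoch, counter)."""
    epoch = zxid >> 32            # high 32 bits: leader epoch
    counter = zxid & 0xFFFFFFFF   # low 32 bits: transaction counter in that epoch
    return epoch, counter
```

Because every change gets a larger zxid than the one before it, comparing cZxid and mZxid of a znode tells you whether it has been modified since it was created.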

SET to change the value

NOTE: mtime, dataVersion and dataLength have changed

ADD 2 sequential nodes to our node
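
When a znode is created as sequential, Zookeeper appends a 10-digit, zero-padded counter to the name, so a client can recover the creation order from the names alone. The sketch below shows one way to do that on the client side; the child names in the usage note are hypothetical and the function names are invented for this example.

```python
from typing import List


def sequence_number(name: str) -> int:
    """Return the counter that ZooKeeper appended to a sequential znode name."""
    return int(name[-10:])  # the suffix is always 10 digits, zero-padded


def in_creation_order(children: List[str]) -> List[str]:
    """Sort child names by their sequence suffix (oldest first)."""
    return sorted(children, key=sequence_number)
```

For example, in_creation_order(["lock-0000000012", "lock-0000000003"]) puts "lock-0000000003" first. Sorting on the suffix rather than on the whole name matters when the children do not share a prefix.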

DELETE

GROUP MEMBERSHIP

TERMINAL 2:

NOTE: -e option will create an ephemeral znode

TERMINAL 3:

NOW CLOSE TERMINALS 2 & 3 AND IN THE MAIN TERMINAL:
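
The three-terminal exercise works because each member registers an ephemeral znode under a group node, and that znode disappears when its session ends. The following toy model reproduces the behaviour in memory. It is NOT a Zookeeper client and does not talk to a server; the class, its methods and the paths in the usage note are all made up for illustration.

```python
from typing import Dict, List, Optional


class ToyZooKeeper:
    """In-memory stand-in for ephemeral-znode behaviour (not a real client)."""

    def __init__(self) -> None:
        # znode path -> owning session id (None means the znode is persistent)
        self._owner: Dict[str, Optional[int]] = {}

    def create(self, path: str, session: Optional[int] = None) -> None:
        """Create a znode; pass a session id to make it ephemeral (like -e)."""
        self._owner[path] = session

    def ls(self, parent: str) -> List[str]:
        """Return the names of the direct children of parent, sorted."""
        prefix = parent.rstrip("/") + "/"
        names = [p[len(prefix):] for p in self._owner if p.startswith(prefix)]
        return sorted(n for n in names if n and "/" not in n)

    def close_session(self, session: int) -> None:
        """Ending a session removes every ephemeral znode that it owns."""
        self._owner = {p: s for p, s in self._owner.items()
                       if s is None or s != session}
```

Creating "/mygroup" as persistent and then "/mygroup/member1" and "/mygroup/member2" with session ids 2 and 3 mirrors terminals 2 and 3. After close_session(2) and close_session(3), ls("/mygroup") returns an empty list, which is what the main terminal shows once the other two are closed.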

MONITORING ZOOKEEPER
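
The monitoring commands below are Zookeeper's "four letter words": short text commands sent to the client port over a plain TCP connection, often with a tool such as nc. The sketch below does the same thing from Python using only the standard library. The function name, default host and default port are assumptions for this example, and newer Zookeeper releases may require these commands to be enabled in the server configuration (the 4lw.commands.whitelist setting) before they respond.

```python
import socket

# Four letter words that match the headings in this section:
#   conf -> configuration on the server
#   cons -> connection/session details
#   stat -> stats
#   dump -> outstanding sessions and ephemeral nodes
#   ruok -> health check; a healthy server replies "imok"
#   wchs -> summary of watches


def four_letter_word(cmd: str, host: str = "localhost", port: int = 2181,
                     timeout: float = 5.0) -> str:
    """Send one four letter word and return the server's raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Against a running server, four_letter_word("ruok") should come back as "imok", and the other commands return multi-line text reports.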

CONFIGURATION ON THE SERVER

CONNECTION/SESSION DETAILS

STATS

DUMP

RUOK (ARE YOU OK?)

WATCHES


About Siva

Senior Hadoop developer with 4 years of experience designing and architecting solutions for the Big Data domain, involved in several complex engagements. Technical strengths include Hadoop, YARN, MapReduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.




