Zookeeper Commands 1


This post collects notes on ZooKeeper commands and scripts. It is mainly useful for Hadoop admins, and all commands are self-explanatory.

  • ZooKeeper is a centralized coordination service for distributed applications
  • Zookeeper addresses issues with distributed applications:
    • Maintain configuration information (share config info across all nodes)
    • Naming service (lets one node find a specific machine in a cluster of thousands of servers)
    • Distributed synchronization (locks, barriers, queues, etc.)
    • Group services (leader election, etc.)
  • ZooKeeper – Replicated mode
    • One leader and rest are followers
    • At startup, the client is given a list of servers and connects to one of them
    • A client can read from any server, but every write operation must go through the leader
  • ZooKeeper – Standalone mode
    • Single server
    • All clients connect to this one server
    • Good for testing/learning
  • ZooKeeper provides 5 consistency guarantees:
    • Sequential consistency (updates from a client are applied in the order they were sent)
    • Atomicity (updates either succeed or fail; no partial updates)
    • Single system image (a client sees the same view of the ZooKeeper service regardless of the server it is connected to)
    • Reliability (once an update succeeds, it persists until another client update overwrites it)
    • Timeliness (a client's view of the system is guaranteed to be up to date within a bounded time, on the order of tens of seconds)
  • ZooKeeper DOES NOT guarantee:
    • Simultaneously consistent cross-client views (different clients may see different versions of ZooKeeper data at the same time); an update made by one client may not be visible to another client immediately because of network delays
    • You can call sync() before a read to work around this; it forces the server the client is connected to to catch up with the leader
  • Distributed processes coordinate through a shared hierarchical namespace
  • A namespace consists of a root node and 1 or more child nodes; these nodes are called znodes
  • Data access is atomic: a read returns all of a znode's data and a write replaces all of it (no partial reads/writes)
  • Each znode has an ACL (access control list)
  • Znodes store small amounts of data, under 1 MB (ZooKeeper is designed for coordination using small data files, not for high-volume storage)
  • 2 types of znodes:
    • Ephemeral (deleted when the client's session ends) (no child nodes)
    • Persistent (stays until explicitly deleted, by any client with permission) (can have child nodes)
  • Sequential znodes:
    • You can ask ZooKeeper to append a monotonically increasing sequence number to the name
    • Can be used for hierarchical ordering
    • Client can parse the name to see the order of creation
  • Time in Zookeeper:
    • Zxid
    • Version
    • Ticks
    • Session timeouts
  • Znode stat fields:
    • czxid, mzxid
    • ctime, mtime
    • version, cversion, aversion
    • ephemeralOwner
    • dataLength
    • numChildren
  • ZooKeeper states: a ZooKeeper client object can be in only one of the following states at a time:
    • Connecting
    • Connected
    • Closed
  • Use getState() to get the state of the object
  • Zookeeper watches
    • Allows applications to get notified of a change
    • 3 properties of a watch:
      • one-time trigger (notified once per watch)
      • trigger occurs when the data or state changes
      • notification is sent to the client that set the watch
    • Watches are used to:
      • Notify of changes in znodes
      • Notify of changes in states
  • Zookeeper clients
    • Java, C, Python, Perl
    • Apache Curator (originally developed by Netflix)
      • Set of Java libraries that make ZooKeeper easier to use
      • Code example: client.create().forPath("/my/path", mydata)
  • Zookeeper recipes
    • Higher-order functions implemented on top of the ZooKeeper service
    • You can use these in your applications for specific needs
    • Out of the box recipes:
      • Name service
      • Configuration
      • Group membership
    • Additional recipes
      • Barriers
      • Locks
      • Two phase commit
      • Leader election
      • Queues
    • Curator framework implements all these recipes except two phase commit
  • Zookeeper atomic broadcast (Zab)
    • Broadcast protocol to propagate state changes from the leader
  • When a Zookeeper object is created, 2 threads are spawned:
    • IO Thread (for all IO)
    • Event Thread (event callbacks)
  • Zookeeper java API
    • [Screenshot of the Java API not included]
  • Zookeeper Exceptions
    • InterruptedException
    • KeeperException: 3 categories
      • State exceptions (ex: 2 processes try to update a znode at the same time and one fails with KeeperException.BadVersionException)
      • Recoverable exceptions (ex: KeeperException.ConnectionLossException)
      • Unrecoverable exceptions (ex: KeeperException.SessionExpiredException)
  • ACL
    • ACL Permissions
      • CREATE
      • READ
      • WRITE
      • DELETE
      • ADMIN
    • Ex: ip:19.23.0.0/16, READ
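
The ACL example above pairs an identity (`ip:19.23.0.0/16`) with a permission (`READ`). Here is a minimal sketch, in plain Java with no ZooKeeper dependency, of how such an entry could be evaluated. The permission bit values match the constants in ZooKeeper's `ZooDefs.Perms`. The class `AclSketch` and its helpers `inCidr`, `toInt` and `allows` are hypothetical names made up for illustration and are not part of the ZooKeeper API. Only IPv4 addresses are handled.

```java
final class AclSketch {
    // Same bit values as the constants in ZooKeeper's ZooDefs.Perms.
    static final int READ = 1, WRITE = 2, CREATE = 4, DELETE = 8, ADMIN = 16;

    // Hypothetical helper: packs a dotted IPv4 address into a 32-bit int.
    static int toInt(String ipv4) {
        int value = 0;
        for (String octet : ipv4.split("\\.")) {
            value = (value << 8) | Integer.parseInt(octet);
        }
        return value;
    }

    // Hypothetical helper: true if the address lies inside a CIDR block such as "19.23.0.0/16".
    static boolean inCidr(String ip, String cidr) {
        String[] parts = cidr.split("/");
        int bits = Integer.parseInt(parts[1]);
        // A /0 block matches everything; Java shifts are taken modulo 32, so handle it explicitly.
        int mask = bits == 0 ? 0 : -1 << (32 - bits);
        return (toInt(ip) & mask) == (toInt(parts[0]) & mask);
    }

    // An operation is allowed when the client matches the identity and the permission bit is granted.
    static boolean allows(String cidr, int grantedPerms, String clientIp, int requestedPerm) {
        return inCidr(clientIp, cidr) && (grantedPerms & requestedPerm) != 0;
    }

    public static void main(String[] args) {
        System.out.println(allows("19.23.0.0/16", READ, "19.23.4.7", READ));   // true
        System.out.println(allows("19.23.0.0/16", READ, "19.23.4.7", WRITE));  // false: only READ is granted
        System.out.println(allows("19.23.0.0/16", READ, "19.24.0.1", READ));   // false: outside the /16 block
    }
}
```

Because permissions are bit flags, several can be combined with `|`, for example `READ | WRITE` for a read-write entry.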
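
The sequential znode notes above say that a client can parse the name to see the order of creation. The sketch below shows one way to do that in plain Java. It assumes every child name ends with the 10-digit, zero-padded counter that ZooKeeper appends to sequential znodes, and that the counter has not overflowed. The class name `SequentialNames`, its methods, and the `job-` prefix are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SequentialNames {
    // ZooKeeper formats the appended counter as 10 zero-padded decimal digits.
    static final int SUFFIX_LENGTH = 10;

    // Extracts the sequence number from a name such as "job-0000000012".
    static int sequenceOf(String znodeName) {
        return Integer.parseInt(znodeName.substring(znodeName.length() - SUFFIX_LENGTH));
    }

    // Returns a copy of the child names sorted by sequence number, i.e. by creation order.
    static List<String> inCreationOrder(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingInt(SequentialNames::sequenceOf));
        return sorted;
    }

    public static void main(String[] args) {
        // Child names as a getChildren() call might return them (order is not guaranteed).
        List<String> children = Arrays.asList("job-0000000012", "job-0000000003", "job-0000000007");
        System.out.println(inCreationOrder(children));
        // prints [job-0000000003, job-0000000007, job-0000000012]
    }
}
```

Recipes such as locks and leader election rely on this ordering: the client whose znode carries the lowest sequence number is treated as the current holder or leader.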
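
The watch notes above describe a one-time trigger: a watch fires once and must be set again to receive further notifications. The following is a toy in-memory model of that behavior, not the real client. The class `OneTimeWatchSketch` and its methods are hypothetical stand-ins that only loosely mirror `getData` and `setData`; the event name `NodeDataChanged` is the one ZooKeeper uses for data changes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class OneTimeWatchSketch {
    private String data;
    private final List<Consumer<String>> watchers = new ArrayList<>();

    OneTimeWatchSketch(String initial) {
        this.data = initial;
    }

    // Reads the data and, if a watcher is given, leaves a watch on this node.
    String getData(Consumer<String> watcher) {
        if (watcher != null) {
            watchers.add(watcher);
        }
        return data;
    }

    // Updates the data and notifies the pending watchers exactly once.
    void setData(String newData) {
        data = newData;
        List<Consumer<String>> toNotify = new ArrayList<>(watchers);
        watchers.clear(); // one-time trigger: a fired watch is removed
        for (Consumer<String> watcher : toNotify) {
            watcher.accept("NodeDataChanged");
        }
    }

    public static void main(String[] args) {
        OneTimeWatchSketch node = new OneTimeWatchSketch("v1");
        List<String> events = new ArrayList<>();

        node.getData(events::add);   // set a watch while reading
        node.setData("v2");          // the watch fires
        node.setData("v3");          // no watch is set any more, so nothing fires
        System.out.println(events.size()); // prints 1

        node.getData(events::add);   // set the watch again
        node.setData("v4");
        System.out.println(events.size()); // prints 2
    }
}
```

As in the real service, the notification says only that something changed; the client reads the node again to get the new data, and can set the next watch in that same read.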

Zookeeper Commands

 

LIST THE TOP LEVEL OF THE ZOOKEEPER NODE