Kafka Installation and Test Broker Setup


In this post we will discuss Kafka installation and test broker setup on an Ubuntu machine with a standalone ZooKeeper instance.

Apache Kafka:

An open-source message broker from the Apache Software Foundation. It was initially developed at LinkedIn and later contributed to the open-source community. It is written in Scala.

Kafka provides a unified, high-throughput, low-latency platform for handling real-time data feeds.

Types of data transported through Kafka:

  • Metrics: operational telemetry data
  • Tracking: everything a LinkedIn.com user does
  • Queuing: between LinkedIn apps, e.g. for sending emails

 

Why is Kafka so fast?

Fast writes:

  • While Kafka persists all data to disk, essentially all writes go to the page cache of the OS, i.e. RAM.
  • See hardware specs and OS tuning (we cover this later).

Fast reads:

  • Very efficient to transfer data from page cache to a network socket
  • Linux: sendfile() system call

Example (Operations): On a Kafka cluster where the consumers are mostly caught up, you will see no read activity on the disks, as the brokers serve data entirely from cache.

Core Components (Kafka Architecture)

 

  • Topics, partitions, replicas, offsets
  • Producers, brokers, consumers
  • Producers write data to brokers.
  • Consumers read data from brokers.
  • All this is distributed.
  • Data is stored in topics.
  • Topics are split into partitions, which are replicated.

Step 1: Download

First download the latest stable version of Kafka from the Apache Download Mirrors. In this post we install kafka_2.11-0.8.2.1.tgz and untar it into the /usr/lib/kafka folder.
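A minimal sketch of the download and extraction, assuming the release is fetched from the Apache archive (pick whichever mirror the download page offers you):

```shell
# Download the Kafka 0.8.2.1 release built for Scala 2.11
wget https://archive.apache.org/dist/kafka/0.8.2.1/kafka_2.11-0.8.2.1.tgz

# Create the target folder and untar the release into it
sudo mkdir -p /usr/lib/kafka
sudo tar -xzf kafka_2.11-0.8.2.1.tgz -C /usr/lib/kafka
```

This leaves the distribution under /usr/lib/kafka/kafka_2.11-0.8.2.1.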

Step 2: Set Environment Variables

Add the below entries to the .bashrc file:

### kafka Home directory ####
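The entries could look like the following sketch, assuming Kafka was untarred under /usr/lib/kafka as in Step 1 (adjust the version and path to match your download):

```shell
export KAFKA_HOME=/usr/lib/kafka/kafka_2.11-0.8.2.1
export PATH=$PATH:$KAFKA_HOME/bin
```

Run `source ~/.bashrc` (or open a new terminal) for the changes to take effect.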

Step 3: Start ZooKeeper

Kafka uses ZooKeeper, so you need to first start a ZooKeeper server if you don't already have one. You can use the convenience script packaged with Kafka to get a quick-and-dirty single-node ZooKeeper instance.
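Following the standard Kafka quickstart, the script is run from the install directory (here assuming $KAFKA_HOME points at it, as set up in Step 2):

```shell
cd $KAFKA_HOME
# Launch a single-node ZooKeeper instance (default port 2181)
bin/zookeeper-server-start.sh config/zookeeper.properties
```

Leave this terminal running; ZooKeeper stays in the foreground.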

Step 4: Start the Kafka Server

Open another terminal and start the Kafka server.
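As in the Kafka quickstart, the broker is started with the bundled script and its default configuration file:

```shell
cd $KAFKA_HOME
# Start the Kafka broker (listens on port 9092 by default)
bin/kafka-server-start.sh config/server.properties
```

This terminal also stays in the foreground while the broker runs.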

Step 5: Create Test Topic

Open another terminal. Create a topic named “test” with a single partition and only one replica:
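With the 0.8.x tooling, topics are created through ZooKeeper; the quickstart command for a single-partition, single-replica topic is:

```shell
cd $KAFKA_HOME
# Create topic "test" with 1 partition and replication factor 1
bin/kafka-topics.sh --create --zookeeper localhost:2181 \
  --replication-factor 1 --partitions 1 --topic test
```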

Use the below list command to view the topic
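The same kafka-topics.sh tool lists the topics registered in ZooKeeper:

```shell
cd $KAFKA_HOME
# List all topics; "test" should appear in the output
bin/kafka-topics.sh --list --zookeeper localhost:2181
```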

Run the producer and then type a few messages into the console to send to the consumer.
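The console producer ships with the distribution; in 0.8.x it takes the broker address via --broker-list:

```shell
cd $KAFKA_HOME
# Each line typed at the prompt is sent as a separate message to topic "test"
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
```

Type a few lines such as "This is a message" and press Enter after each one.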

Step 6: Start Consumer

Open another terminal and watch the producer's messages being consumed.
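In the 0.8.x releases the console consumer connects through ZooKeeper (newer Kafka versions use --bootstrap-server instead):

```shell
cd $KAFKA_HOME
# Print every message in topic "test" from the beginning of the log
bin/kafka-console-consumer.sh --zookeeper localhost:2181 \
  --topic test --from-beginning
```

The messages typed into the producer terminal should appear here.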

If you receive messages like the above, you have successfully set up the Kafka console producer and consumer.


About Siva

Senior Hadoop developer with 4 years of experience in designing and architecting solutions for the Big Data domain, and has been involved with several complex engagements. Technical strengths include Hadoop, YARN, MapReduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.
