Kafka on MacBook & Simple Benchmarking

1. Brief Introduction – Apache Kafka

Apache Kafka is a distributed streaming platform that is widely used for building real-time data pipelines and streaming applications. It is horizontally scalable, fault-tolerant, and fast, and it runs in production at thousands of companies.

Official website: https://kafka.apache.org/

2. Installation Steps

Environment

  • Operating System: macOS Sierra
  • Processor: 2.8 GHz Intel Core i7
  • Memory: 16 GB 1600 MHz DDR3
  • Storage: 500 GB SSD

Prerequisite Tools

Install Apache Zookeeper through Homebrew

Kafka servers require ZooKeeper for several purposes, including leader election and membership management.
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.

$ brew install zookeeper
==> Downloading https://homebrew.bintray.com/bottles/zookeeper-3.4.9.sierra.bottle.tar.gz
##################################################### 100.0%
==> Pouring zookeeper-3.4.9.sierra.bottle.tar.gz
==> Caveats
To have launchd start zookeeper now and restart at login:
brew services start zookeeper
Or, if you don't want/need a background service you can just run:
zkServer start
==> Summary
🍺 /usr/local/Cellar/zookeeper/3.4.9: 238 files, 18.2M

Install Apache Kafka through Homebrew

$ brew install kafka
==> Downloading https://homebrew.bintray.com/bottles/kafka-0.10.1.0.sierra.bottle.tar.gz
##################################################### 100.0%
==> Pouring kafka-0.10.1.0.sierra.bottle.tar.gz
==> Caveats
To have launchd start kafka now and restart at login:
brew services start kafka
Or, if you don't want/need a background service you can just run:
zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties & kafka-server-start /usr/local/etc/kafka/server.properties
==> Summary
🍺 /usr/local/Cellar/kafka/0.10.1.0: 128 files, 35.2M

Start Zookeeper

$ brew services start zookeeper
==> Successfully started `zookeeper` (label: homebrew.mxcl.zookeeper)

Start Kafka

$ brew services start kafka
==> Successfully started `kafka` (label: homebrew.mxcl.kafka)
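
"Successfully started" only means launchd accepted the job, so it can be useful to confirm that both services are actually listening before benchmarking. The sketch below is one way to do that. wait_for_port is a hypothetical helper, not part of Kafka or Homebrew, and it assumes a bash built with /dev/tcp support. The ports 2181 (ZooKeeper) and 9092 (Kafka) are the ones used by the commands later in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kafka or Homebrew):
# poll HOST:PORT until a TCP connection succeeds, trying up to TRIES times.
wait_for_port() {
  local host=$1 port=$2 tries=${3:-30} i=1
  while [ "$i" -le "$tries" ]; do
    # bash-specific: opening /dev/tcp/HOST/PORT attempts a TCP connection
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    # pause only between attempts, not after the last one
    [ "$i" -lt "$tries" ] && sleep 1
    i=$((i + 1))
  done
  return 1
}

# Example usage (assumes the default ports seen elsewhere in this post):
# wait_for_port localhost 2181 && echo "ZooKeeper is accepting connections"
# wait_for_port localhost 9092 && echo "Kafka is accepting connections"
```

A successful connection only shows that the port is open. It does not prove the broker has finished registering with ZooKeeper, so treat it as a quick sanity check.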

Kafka Command Line Tools

$ kafka-
kafka-acls                        kafka-mirror-maker                kafka-server-start
kafka-configs                     kafka-preferred-replica-election  kafka-server-stop
kafka-console-consumer            kafka-producer-perf-test          kafka-simple-consumer-shell
kafka-console-producer            kafka-reassign-partitions         kafka-streams-application-reset
kafka-consumer-groups             kafka-replay-log-producer         kafka-topics
kafka-consumer-offset-checker     kafka-replica-verification        kafka-verifiable-consumer
kafka-consumer-perf-test          kafka-run-class                   kafka-verifiable-producer

3. Simple Benchmarking

Create Topics

Create a topic named test-topic-with-1partition-1replication with 1 partition and a replication factor of 1

$ kafka-topics --zookeeper localhost:2181 --create --topic test-topic-with-1partition-1replication --partitions 1 --replication-factor 1
Created topic "test-topic-with-1partition-1replication".

Producer benchmarking

Run a single producer that pushes 10 million 100-byte messages into test-topic-with-1partition-1replication

$ kafka-producer-perf-test  --topic test-topic-with-1partition-1replication --num-records 10000000 --record-size 100 --throughput -1 --producer-props acks=1 bootstrap.servers=localhost:9092 buffer.memory=67108864 batch.size=8192
1373823 records sent, 274654.7 records/sec (26.19 MB/sec), 1351.5 ms avg latency, 2170.0 max latency.
2747928 records sent, 549585.6 records/sec (52.41 MB/sec), 911.2 ms avg latency, 1046.0 max latency.
2719929 records sent, 543985.8 records/sec (51.88 MB/sec), 912.5 ms avg latency, 999.0 max latency.
2576457 records sent, 515291.4 records/sec (49.14 MB/sec), 968.8 ms avg latency, 1120.0 max latency.
10000000 records sent, 479064.865383 records/sec (45.69 MB/sec), 982.90 ms avg latency, 2170.00 ms max latency, 918 ms 50th, 1574 ms 95th, 2043 ms 99th, 2154 ms 99.9th.
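
The records/sec and MB/sec columns above are two views of the same number, linked by the 100-byte record size. The sketch below reproduces the conversion. mb_per_sec is a hypothetical helper, and it assumes 1 MB = 1024 * 1024 bytes, which is what the reported figures imply.

```shell
# Hypothetical helper: convert records/sec and record size (bytes) to MB/sec.
# Assumes 1 MB = 1024 * 1024 bytes.
mb_per_sec() {
  LC_ALL=C awk -v rps="$1" -v size="$2" \
    'BEGIN { printf "%.2f\n", rps * size / (1024 * 1024) }'
}

mb_per_sec 479064.865383 100   # overall average, prints 45.69
mb_per_sec 274654.7 100        # first reporting interval, prints 26.19
```

Both results match the MB/sec values printed by kafka-producer-perf-test, so the throughput figure counts payload bytes only.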

Consumer benchmarking

Run a single-threaded consumer that reads all messages in test-topic-with-1partition-1replication

$ kafka-consumer-perf-test --topic test-topic-with-1partition-1replication --zookeeper localhost:2181 --threads 1 --messages 10000000
start.time,   end.time,     data.consumed.in.MB, MB.sec,   data.consumed.in.nMsg, nMsg.sec
10:46:39:015, 10:46:42:403, 953.6743,            281.4859, 10000000,              2951593.8607
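
The consumer figures can be cross-checked from the two timestamps alone. The sketch below does that. consumer_rates is a hypothetical helper, and it assumes timestamps in the HH:MM:SS:mmm form shown above, a run that does not cross midnight, and the same 100-byte records written by the producer test.

```shell
# Hypothetical helper: derive elapsed seconds, MB/sec and messages/sec
# from start time, end time, message count and record size in bytes.
# Assumes HH:MM:SS:mmm timestamps within a single day.
consumer_rates() {
  LC_ALL=C awk -v t0="$1" -v t1="$2" -v n="$3" -v size="$4" '
    # convert HH:MM:SS:mmm to whole milliseconds
    function to_ms(t,   p) {
      split(t, p, ":")
      return ((p[1] * 60 + p[2]) * 60 + p[3]) * 1000 + p[4]
    }
    BEGIN {
      secs = (to_ms(t1) - to_ms(t0)) / 1000
      mb = n * size / (1024 * 1024)
      printf "%.3f %.4f %.4f\n", secs, mb / secs, n / secs
    }'
}

consumer_rates 10:46:39:015 10:46:42:403 10000000 100
# prints: 3.388 281.4859 2951593.8607
```

The run took 3.388 seconds, and the derived rates agree with the MB.sec and nMsg.sec columns, which also confirms that data.consumed.in.MB is 10,000,000 records of 100 bytes each.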