The most straightforward way to get useful metrics from a Kafka server in production is probably JMX. That works fine, but HTTP can be much more convenient, especially if you are not in a Java-dominated environment.
Although JMX is built in, Kafka uses Yammer Metrics, which makes it relatively easy to add further monitoring endpoints. Thanks to Arno Broekhof, there is a project on GitHub that does exactly this and exposes an HTTP-based interface where the monitoring data can be fetched in JSON format. It targets Kafka version 0.8, but there are a couple of forks that migrate it to newer Kafka versions. Mine is among them and uses Kafka 0.10.1.0, the latest version at the time of writing this post - you can find it here.
To build it, clone the project and run mvn package, which builds a jar file. Copy this jar file to your Kafka installation's libs directory and extend your server.properties file to enable the HTTP metrics reporter. Mine looks something like this:
This enables the reporter. You can also configure the port that the built-in Jetty HTTP server uses, as well as the network interface to listen on. In my case, I just use localhost, as my monitoring agent collects and processes this data on the local machine.
After restarting your Kafka server, you should be able to get metrics data via HTTP:
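If you would rather fetch the data from Python than from the command line, a minimal sketch could look like the one below. The URL is an assumption (hypothetical host, port and path; use whatever you configured for the reporter), and the `opener` parameter exists only so that the function can be tried out without a running broker.

```python
import json
import urllib.request

# Hypothetical endpoint: adjust host, port and path to your reporter configuration.
METRICS_URL = "http://localhost:8080/api/metrics"


def fetch_metrics(url=METRICS_URL, timeout=5, opener=urllib.request.urlopen):
    """Fetch the metrics document over HTTP and decode it as JSON."""
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_metrics()` against a live broker should return the metrics as nested Python dictionaries.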
As the output is JSON, you might want to process it with a small script, for example in Python:
The above command will give you the number of incoming messages the Kafka broker is dealing with.
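To illustrate the JSON processing step, here is a small sketch that pulls one value out of a nested metrics document. The sample payload and the key names in it are made up for the example; the real document produced by the reporter may be laid out differently, so inspect your own output first.

```python
import json

# Made-up sample payload; the real document's layout may differ.
SAMPLE = (
    '{"kafka.server": {"BrokerTopicMetrics": '
    '{"MessagesInPerSec": {"count": 1234, "m1": 5.5}}}}'
)


def get_metric(metrics, path):
    """Walk a nested dict along a sequence of keys; return None if a key is missing."""
    node = metrics
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


metrics = json.loads(SAMPLE)
messages_in = get_metric(
    metrics, ["kafka.server", "BrokerTopicMetrics", "MessagesInPerSec", "count"]
)
print(messages_in)
```

Returning `None` for a missing key, rather than raising, makes it easy for a monitoring check to report an unknown state instead of crashing.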
I am using check_mk to monitor the service, but I think Nagios uses the same format anyway. My monitoring script looks something like this (without all the specific details):
I could use Python for the whole script, but I prefer to use it only for the JSON parsing and keep the rest of the code in Bash, just for consistency with my other checks.
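For comparison, the reporting part of a pure-Python check might look roughly like the sketch below. It prints one line in the check_mk local check format (status, service name, performance data, text). The service name, metric name and thresholds are hypothetical placeholders, not values taken from my setup.

```python
def local_check_line(service, metric, value, warn, crit):
    """Build one check_mk local check line: status, service, perfdata, text."""
    if value is None:
        # Status 3 means UNKNOWN; "-" stands for "no performance data".
        return "3 {} - UNKNOWN - metric {} not available".format(service, metric)
    if value >= crit:
        status, label = 2, "CRIT"
    elif value >= warn:
        status, label = 1, "WARN"
    else:
        status, label = 0, "OK"
    perfdata = "{}={};{};{}".format(metric, value, warn, crit)
    return "{} {} {} {} - {} is {}".format(
        status, service, perfdata, label, metric, value
    )


# Hypothetical names and thresholds, for illustration only.
print(local_check_line("Kafka_messages_in", "messages_in_rate", 5.5, 1000, 5000))
```

The service name must not contain spaces, since check_mk splits the line on whitespace.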