Features Graphite Lead image: Lead Image © Maxim Kazmin, 123RF.com
Lead Image © Maxim Kazmin, 123RF.com

Visualizing time series data

Painting by Numbers

Graphite converts confusing columns of time series data into handy diagrams, showing trends in operating system and application metrics at a glance. By Jens-Christoph Brendel

Graphite [1] is hierarchically structured, real-time graphing system (Figure 1). A client collects data from the source, and a Graphite daemon (Carbon [2]) on TCP port 2003 receives the data and stores it in a round-robin database named Whisper.

Graphite is hierarchically structured; various components must play together.
Figure 1: Graphite is hierarchically structured; various components must play together.

A web application then grabs the data from this database and creates charts. The client can be programmed by the user or it can come as a prepared daemon (e.g., collectd [3]). If you like to measure your own applications, you can send performance data to a client like statsd [4].


The easiest way to install Graphite is from the package repository of your distribution. Graphite is available in some of the latest distributions, like Fedora 20 or Ubuntu 14.04 LTS. You can also find ready-made images for a virtualization solution like Docker. A simple docker graphite search shows a number of Graphite images.

Of course, you can also install Graphite from the Git sources [5]. In this case, however, you must consider that not all versions of the needed libraries will work well together, whereas the package maintainers have taken over the task of sorting this out for you.

All packages come with prepared config files. They contain reasonable defaults, so you won't have much to edit. Fedora has the necessary init files, but they are not activated to start automatically on a certain run level.

Example Client

The Git sources contain an example client written in Python that is missing in the Ubuntu package. However, this is not the end of world, because it is fairly easy to write a client in a language of your choice: I used Perl for an example in this article (Listing 1). The only thing the client has to do is send a text string in the format

Listing 1: Perl Example Client

01 #!/usr/bin/perl
03 use strict;
04 use warnings;
05 use IO::Socket;
06 use Sys::CpuLoad;
08 my $remote_host = '';
09 my $remote_port = 2003;
11 # create Socket
12 my $socket = IO::Socket::INET -> new(PeerAddr => $remote_host,
13     PeerPort => $remote_port,
14     Proto => "tcp",
15     Type => SOCK_STREAM)
16     or die "Couldn't connect to $remote_host:$remote_port: $@ \n";
19 while() {
20  my @lavg = Sys::CpuLoad::load();
21  my $ts=time();
22  print $socket "system.loadavg_1min $lavg[0] $ts\n";
23  print $socket "system.loadavg_5min $lavg[1] $ts\n";
24  print $socket "system.loadavg_15min $lavg[2] $ts\n";
25  sleep(60);
26 }
28 close($socket);
<metric_name> <metric_value> <metric_timestamp>

to port 2003 on the Graphite server. The time stamp is an integer value in Unix time (i.e., POSIX time or Epoch time), which is the number of seconds since 00:00:00 Thursday January 1, 1970, Coordinated Universal Time (UTC).

The example client in Listing 1 sends the load average values of the last 1, 5, and 15 minutes (lines 22-24), which you can find in the web GUI under the system node. Dots in the name create folders in the hierarchy; to create a graph, you then just click on one of the values in the web app.


An alternative to a self-written client is collectd. This daemon, which is available in most distros from their package management system, can delivery a variety of performance data to Graphite.

The collectd configuration file, /etc/collectd/collectd.conf, will need some changes. You have to remove the comment characters before the write_graphite plugin entry, adapt the Host line, and set the protocol to tcp.

The rrdtool plugin isn't necessary anymore and should be commented out. Furthermore, you should set AutoLoadPlugin to true. Next, activate the desired plugins listed in the config file; then, restart the daemon with

/etc/init.d/ collectd restart

to send the values of the activated collectd plugins to Graphite. If the daemon is receiving values, you should see a line like

14/07/2014 12:38.21 :: MetricLineReceiver connection with \ established

in the logfile /var/log/carbon/listener.log. If not, an error message may point you to a malformed line. If you have a problem with the connection, you can look in the syslog of the collectd host. If all went well, a collectd entry appears in the tree view of the Graphite host with all the values for the activated collectd plugins (Figure 2):

Graphite tree view detail of the collectd data sources.
Figure 2: Graphite tree view detail of the collectd data sources.


If you are not interested in operating system metrics, but in the performance of your own applications, statsd is your friend. This daemon can count events, measure their duration, and buffer and forward the resulting values on a regular basis to a back end like Graphite.

Libraries for a number of programming languages can increment counters, start timers, and send the results to statsd. To start statsd from /opt/statsd via the Node.js JavaScript server, you can enter the command:

/usr/bin/node ./stats.js ./localConfig.js

Before you do this, however, you need to generate or edit the configuration file localConfig.js, which should contain at least the following lines (you will need to adjust the IP address):

   graphitePort: 2003,
   graphiteHost: "",
   port: 8125

In this example, I use the Perl package Net::Statsd to instrument a benchmark (Listing 2).

Listing 2: Instrumentation with Perl

01 #!/usr/bin/perl
03 use Time::HiRes qw(gettimeofday tv_interval);
04 use Net::Statsd;
06 $Net::Statsd::HOST = 'localhost';    # Default
07 $Net::Statsd::PORT = 8125;           # Default
09 my $t0 = [gettimeofday];
11 my($one, $two, $three) = map { $_ x 4096 } 'a'..'c';
13 $s .= $one;
14 $s .= $two;
15 $s .= $three;
17 my $temp;
18 for (my $i=0; $i<12288; $i++) {
19    $temp=substr($s,length($s)-1,1);
20    $s=$temp.$s;
21    $s = substr($s,0,12288);
22 }
24 $elapsed = tv_interval ( $t0, [gettimeofday]);
25 $elapsed = int($elapsed * 1000 * 1000);
27 Net::Statsd::timing('charbench', $elapsed);

This trivial script first creates a 4KB string of each of the letters a, b, and c; then, it puts the last character of the string into the first position over and over until the string reaches its initial state.

The script also measures the duration of each pass and sends the value via UDP to statsd. Statsd subsequently forwards the value to Graphite. By starting the script as a cron job once a minute, you can create a chart as in Figure 3.

Values passed from the instrumented benchmark, to statsd, and then to Graphite.
Figure 3: Values passed from the instrumented benchmark, to statsd, and then to Graphite.


The web GUI contains the tree view of all configured metrics in the left frame and shows a resulting graph in the right frame. If multiple metrics have a similar range of values, you can superimpose graphs.

In all other cases, use the Dashboard button, which leads to an area in which you can arrange a number of graphs in the same window by choosing metrics either from the tree view at the left or from a menu in a top frame (Figure 4).

Choose the metrics for the dashboard..
Figure 4: Choose the metrics for the dashboard..

Every metric you choose gets its own graph. If you drag one chart on top of another, the graphs are combined (Figure  5).

Dashboard with superimposed charts.
Figure 5: Dashboard with superimposed charts.

You can save dashboards and reload them later, and a search function lets you search for saved dashboards. Chart sizes, periods of time displayed, and automatic refresh intervals can all be customized.


Graphite is released under the Apache 2.0 license. It was written by Chris Davis and is maintained and actively developed by Orbitz.com to "… visualize a variety of operations-critical data including application metrics, database metrics, sales, etc.," and "… handle approximately 160,000 distinct metrics per minute running on two niagra-2 Sun servers on a very fast SAN" [6]. Graphite is thus best used in environments that need to monitor thousands of regularly updated metrics in real time.