Introduction

These are my notes from trying out Diamond, a metrics collection tool. For storing and visualizing the time-series data, I used the standard combination of Graphite and Grafana.

Time series data storage and visualization

First, a brief introduction to Graphite and Grafana. Graphite provides the following services.

Storage for time-series data
An API for retrieving the stored data

Graphite can render graphs on its own, but its built-in visualization is hard to call polished. In practice, Graphite often serves only as the API, and a separate tool handles visualization such as dashboard generation. This time I used Grafana for that.

With Grafana, you can easily create a stylish dashboard.

The following pages are helpful for Graphite and Grafana.

Visualize Chef execution results with Chef Handlers + Graphite
Graphite Documentation
Memo that I tried using Graphite and Grafana for about an hour
Create a dashboard to display Graphite data using Grafana
Grafana Official

Diamond, a metrics collection tool

What is Diamond

This is the main topic of this post.

In the following, "metrics" and "data" are used without much distinction. Strictly speaking, a metric is a measured indicator such as CPU usage or load average, and it becomes data once some process collects it. The two are different things, but I do not separate them here, just as saying "the load average was xxx" does not distinguish between the indicator called load average and the collected value itself.

Diamond can be broadly divided into two components.

collector
handler

The collector is responsible for collecting data. It only collects; it does not deliver the collected data outside of Diamond. That is the job of the handler, which processes the collected data. For example, a handler can write the data obtained by a collector to a local file, send it to Graphite (strictly speaking, probably to carbon), save it in a MySQL table, or send an alert when something is wrong.
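The division of roles can be illustrated with a toy sketch in plain Python. Nothing here is Diamond's actual API: collect_load, StreamHandler, AlertHandler, and the threshold are hypothetical names made up for illustration. The only point is that each collected value is passed to every enabled handler.

```python
import io


def collect_load():
    """Hypothetical collector: returns (metric_name, value) pairs.

    A real collector would measure something; fixed values keep the
    sketch deterministic.
    """
    return [("loadavg.01", 0.42), ("loadavg.05", 3.10)]


class StreamHandler:
    """Hypothetical handler that writes each metric to a file-like object."""

    def __init__(self, stream):
        self.stream = stream

    def process(self, name, value):
        self.stream.write("%s %s\n" % (name, value))


class AlertHandler:
    """Hypothetical handler that records an alert above a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.alerts = []

    def process(self, name, value):
        if value > self.threshold:
            self.alerts.append(name)


def run_once(collector, handlers):
    """Collect once and pass every metric to every handler."""
    for name, value in collector():
        for handler in handlers:
            handler.process(name, value)


out = io.StringIO()
alerting = AlertHandler(threshold=2.0)
run_once(collect_load, [StreamHandler(out), alerting])
```

Here the collector knows nothing about where its values end up, which is why the same collector can feed a file, Graphite, or an alert without modification.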

Also, Diamond provides a daemon that runs the collector on a regular basis, so you don't have to write your own scheduler.

Diamond's strengths

Whether this counts as a strength is debatable, but Diamond has a fairly large number of collectors provided by the community (see Diamond/Collectors). They cover various system information, well-known software such as Apache, nginx, MySQL, PostgreSQL, Redis, MongoDB, and DRBD, and relatively niche targets such as OpenStack Swift.

Getting it running

Installation

Basically, I followed the instructions at Diamond/Usage. Even though it is mainly a Python project, it is built with make.

For reference, here is the system information of the machine I installed it on.

-(risuo@ebi)-(0)-
-[9362]% cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.10
DISTRIB_CODENAME=quantal
DISTRIB_DESCRIPTION="Ubuntu 12.10"
-(risuo@ebi)-(0)-
-[9369]% uname -a
Linux ebi 3.5.0-23-generic #35-Ubuntu SMP Thu Jan 24 13:05:29 UTC 2013 i686 athlon i686 GNU/Linux

Edit configuration file

First, edit diamond.conf. The default contents are shown in Diamond/conf/diamond.conf.example. A normal installation creates /etc/diamond/diamond.conf.example, so copy it to diamond.conf in the same directory.

To send data to Graphite, start with this section.

[[GraphiteHandler]]
### Options for GraphiteHandler

# Graphite server host
host = graphite

# Port to send metrics to
port = 2003

# Socket timeout (seconds)
timeout = 15

# Batch size for metrics
batch = 1

Set host to the host of your Graphite server, and change the port number if needed. GraphiteHandler must also be listed in handlers in the [server] section, but it is there by default, so you do not need to worry about it for now.
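For context, port 2003 is where carbon listens for Graphite's plaintext protocol, in which each metric is a single line of the form "path value timestamp". Below is a minimal sketch of building and sending such a line. The function names and the metric path servers.ebi.loadavg.01 are made up for illustration, and this is not a description of how GraphiteHandler is implemented.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext-protocol line: path, value, timestamp, newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)


def send_metric(line, host="graphite", port=2003, timeout=15):
    """Send one line to carbon over TCP.

    The defaults mirror the host, port, and timeout values in the
    configuration shown above.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric path; the line is only formatted here, not sent.
line = format_metric("servers.ebi.loadavg.01", 0.42, timestamp=1359000000)
```

Calling send_metric(line) by hand against your Graphite host can be a quick way to check connectivity separately from Diamond.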

That is enough to get it running for now. Start the daemon with either of the following commands.

sudo /etc/init.d/diamond start
sudo service diamond start

Wait a while, then look at the Grafana dashboard. If you can select metrics that look like system information, it worked. In my case, metrics related to cpu, diskspace, memory, iostat, and load average showed up.

For troubleshooting, dig through /var/log/diamond/diamond.log.

Tips

I want to set the metrics collection interval

Set the interval value in the conf file of each collector. You can also set a global default in diamond.conf, or embed the value in the collector's source code; read the get_default_config method to see how it is written. Diamond/Configuration has the details.

Write it like this (the unit is seconds).

interval = 60

I want to configure the logger

You can specify the log format and the output destination. Reading Diamond/Configuration gives you the general idea.

I want to configure handlers

List the handlers you want to enable in handlers in the [server] section. By default, GraphiteHandler and ArchiveHandler are enabled, specified like this.

handlers = diamond.handler.graphite.GraphiteHandler, diamond.handler.archive.ArchiveHandler

For example, to send data to Graphite in pickled form, set GraphitePickleHandler in the [server] section. To save the collected data in MySQL, specify MySQLHandler.

The community also provides InfluxdbHandler, a handler that stores data in InfluxDB. Grafana can also draw graphs from InfluxDB, so choose whichever you prefer.
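As background on what pickling means here: carbon's pickle receiver accepts a pickled list of (path, (timestamp, value)) tuples, prefixed with a 4-byte big-endian length header. The following is a minimal sketch of that framing. The function name and metric paths are hypothetical, and the code is not taken from GraphitePickleHandler.

```python
import pickle
import struct


def build_pickle_message(metrics):
    """Frame metrics for carbon's pickle receiver.

    metrics: list of (path, (timestamp, value)) tuples.
    Returns a 4-byte big-endian length header followed by the pickled list.
    """
    payload = pickle.dumps(metrics, protocol=2)
    header = struct.pack("!L", len(payload))
    return header + payload


# Hypothetical metric paths and values.
batch = [
    ("servers.ebi.loadavg.01", (1359000000, 0.42)),
    ("servers.ebi.memory.MemFree", (1359000000, 123456)),
]
message = build_pickle_message(batch)
```

Because many metrics travel in one message, this form suits batched sending better than one plaintext line per metric.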

The settings for each handler are described in the [handlers] section.

I want to develop a custom collector

You can develop a custom collector by extending diamond.collector.Collector (normally, by writing a derived class). Among the methods of diamond.collector.Collector, the publish family of methods is the interface between collector and handler. The class also has an unimplemented collect method, which developers implement in their derived class. Diamond/CustomCollectors is a helpful reference.

That page alone does not contain much information, though, so if you develop a custom collector I recommend reading the implementation of the Collector class in collector.py.
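To make the relationship between collect and publish concrete, here is a minimal sketch. So that it runs without Diamond installed, it defines a small stand-in Collector base class. That stand-in is my own simplification and not Diamond's real implementation, and ExampleCollector and its metric name are hypothetical. With Diamond itself, you would derive from diamond.collector.Collector instead.

```python
class Collector:
    """Simplified stand-in for diamond.collector.Collector (NOT the real class)."""

    def __init__(self):
        # The real class passes metrics on to handlers; this stand-in
        # just remembers them so the flow is visible.
        self.published = []

    def publish(self, name, value):
        """Stand-in for the collector-to-handler interface."""
        self.published.append((name, value))

    def collect(self):
        """Left unimplemented; derived classes override this."""
        raise NotImplementedError()


class ExampleCollector(Collector):
    """Hypothetical custom collector."""

    def collect(self):
        # A real collector would measure something here.
        self.publish("my.example.metric", 42)


collector = ExampleCollector()
collector.collect()  # with Diamond, the daemon makes this call on each interval
```

The derived class only decides what to measure and what to call it; scheduling and delivery stay outside it.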

For example, the publish method has an argument called precision, which specifies how many digits after the decimal point are kept in a metric value. It is 0 by default, meaning values are handled as integers only. There are quite a lot of situations where you want to handle float metrics, but as far as I could tell you will not find the solution unless you read the source code.
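The effect of precision can be sketched with plain string formatting. This illustrates rounding to a fixed number of decimal places, on the assumption that the value is rendered that way; format_value is a hypothetical helper, not Diamond's code.

```python
def format_value(value, precision=0):
    """Render value with `precision` digits after the decimal point."""
    return "%.*f" % (precision, value)


as_integer = format_value(1.37)             # precision=0 drops the fraction
as_float = format_value(1.37, precision=2)  # keeps two decimal places
```

In a custom collector, the corresponding call would be along the lines of self.publish(name, value, precision=2).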

Summary / impression

This was only a brief write-up, but that is my introduction to Diamond. Keeping data visible is important, so I think it is worth keeping tools like these in mind going forward.
