Infrastructure monitoring using Graphite and StatsD

Introduction

Collecting metrics about servers, applications, and traffic is an important part of any application development project. Many things can go wrong in production systems, and collecting and organizing data helps you identify infrastructure bottlenecks and other issues.

In this article, we'll discuss Graphite and StatsD, and how they can form the basis of your monitoring infrastructure. Book a free demo of MetricFire (https://www.metricfire.com/demo/?utm_source=blog&utm_medium=Qiita&utm_campaign=Japan&utm_content=Monitoring%20your%20infrastructure%20with%20StatsD%20and%20Graphite) to discuss your needs, then sign up for a free trial of Hosted Graphite, a managed Graphite service.

Graphite is a monitoring tool made up of several components. Here is a brief description of each one.

Graphite web application

The Graphite web application is where you create graphs and plot data. It can also save graph properties and layouts.

Carbon

Carbon is Graphite's storage backend. It is a daemon that can be configured to listen on TCP/UDP ports. To handle increasing load, or to configure replication and sharding, you can run multiple Carbon daemons on one or more hosts and use Carbon relays to distribute the load.
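As a rough illustration of what Carbon receives: its plaintext listener (port 2003 in the Docker setup later in this article) accepts one line per data point in the form "metric.path value timestamp", with the timestamp in Unix seconds. The helper name formatCarbonLine and the metric path below are made up for this sketch, which only builds the line and does not open a connection.

```javascript
// Build one line of Carbon's plaintext protocol: "<path> <value> <unix-seconds>\n".
// formatCarbonLine is a hypothetical helper name, not part of Graphite.
function formatCarbonLine(path, value, date = new Date()) {
  if (/\s/.test(path)) {
    throw new Error("metric path must not contain whitespace");
  }
  const unixSeconds = Math.floor(date.getTime() / 1000);
  return `${path} ${value} ${unixSeconds}\n`;
}

// Example with a fixed date so the result is reproducible.
console.log(formatCarbonLine("servers.web01.load", 0.42, new Date(1700000000000)));
```

In practice you would write such lines to a TCP socket connected to the Carbon host, or let StatsD do it for you.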

Whisper

Whisper is the database format Graphite uses to store data.

Whisper rolls recent high-resolution (per-second) data up into lower resolutions as it ages, so historical data can be retained for long periods.
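To make the roll-up idea concrete, here is a minimal sketch that averages per-second points into 60-second buckets. It assumes averaging as the aggregation method and does not model Whisper's on-disk format or retention configuration; the function name downsample is hypothetical.

```javascript
// Roll per-second points up into coarser buckets by averaging each bucket.
// points: array of [unixSeconds, value]; bucketSeconds: target resolution.
function downsample(points, bucketSeconds) {
  const buckets = new Map();
  for (const [timestamp, value] of points) {
    // Align each timestamp to the start of its bucket.
    const start = timestamp - (timestamp % bucketSeconds);
    const bucket = buckets.get(start) || { sum: 0, count: 0 };
    bucket.sum += value;
    bucket.count += 1;
    buckets.set(start, bucket);
  }
  return [...buckets.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([start, { sum, count }]) => [start, sum / count]);
}

const perSecond = [[0, 10], [1, 20], [59, 30], [60, 40], [61, 60]];
// Yields two points: [0, 20] and [60, 50].
console.log(downsample(perSecond, 60));
```

Five points shrink to two, which is the trade Whisper makes: less detail for old data in exchange for a longer retention window.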

Now that we've talked about Graphite, let's talk about StatsD.

StatsD

StatsD is a Node.js application. It was created to collect data points about networks, servers, and applications and forward them to a backend such as Graphite, where they can be rendered into graphs.

Setup

Use the Docker image at https://hub.docker.com/r/graphiteapp/docker-graphite-statsd/.

This is a very simple docker-compose.yml:

version: "3"
services:
  graphite-statsd:
    image: graphiteapp/docker-graphite-statsd
    ports:
      - 2003-2004:2003-2004
      - 2023-2024:2023-2024
      - 8125:8125/udp
      - 8126:8126
      - 80:80

After starting this Docker image, open http://localhost in your browser to load the Graphite web application.

At this point, Graphite should not contain any metrics yet. Let's test the deployment by sending a simple metric to the StatsD daemon with the following command:

echo "deploys.test.myservice:1|c" | nc -w 1 -u localhost 8125

The syntax here is:

bucket:value|type

Bucket

The bucket is the metric identifier. Metric datagrams with the same bucket and the same type are treated by the server as occurrences of the same event. In the example above, we used "deploys.test.myservice" as the bucket.

Value

Value is the number associated with the metric. The meaning of the value depends on the type of metric.

Type

Type determines the type of metric. There are different metric types such as timers, counters, gauges, and histograms.
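The bucket:value|type syntax is simple enough to build and parse by hand. The sketch below shows both directions; the helper names formatMetric and parseMetric are hypothetical, and optional extras that real StatsD lines may carry (such as a sample rate) are deliberately ignored. The type codes "c" and "ms" are the ones used in this article, and "g" is the code for a gauge.

```javascript
// Build a StatsD line from its three parts.
function formatMetric(bucket, value, type) {
  return `${bucket}:${value}|${type}`;
}

// Split a "bucket:value|type" line back into its parts.
function parseMetric(line) {
  const match = /^([^:|]+):([^|]+)\|([a-z]+)$/.exec(line.trim());
  if (!match) {
    throw new Error(`not a bucket:value|type line: ${line}`);
  }
  const [, bucket, value, type] = match;
  return { bucket, value: Number(value), type };
}

console.log(formatMetric("deploys.test.myservice", 1, "c"));
console.log(parseMetric("deploys.test.myservice.time:55|ms"));
```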

Timer

Timers differ from counters in that they measure time intervals. For example, if you want to measure how long a REST API takes to respond, use a timer. A single timer measurement, say 100 ms, is not very useful on its own; it is more useful to aggregate measurements over an interval such as 6 hours. Several sub-metrics are calculated automatically for each timer, including the mean, the standard deviation, and the 50th, 90th, and 95th percentiles.

echo "deploys.test.myservice.time:55|ms" | nc -w 1 -u localhost 8125
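To give a feel for the sub-metrics a timer produces, here is a small sketch that summarizes the values received during one interval. The percentile uses the simple nearest-rank method, which is an assumption for this example and may differ from how StatsD calculates its own percentiles; summarizeTimer is a hypothetical name and the sample values are invented.

```javascript
// Summarize timer values (in ms) collected during one interval.
// Percentiles use the nearest-rank method, an assumption made for this
// sketch that may not match StatsD's own calculation.
function summarizeTimer(values) {
  if (values.length === 0) {
    throw new Error("no timer values to summarize");
  }
  const sorted = [...values].sort((a, b) => a - b);
  const count = sorted.length;
  const mean = sorted.reduce((sum, v) => sum + v, 0) / count;
  const variance = sorted.reduce((sum, v) => sum + (v - mean) ** 2, 0) / count;
  const percentile = (p) => sorted[Math.max(0, Math.ceil((p * count) / 100) - 1)];
  return {
    count,
    mean,
    stddev: Math.sqrt(variance),
    p50: percentile(50),
    p90: percentile(90),
    p95: percentile(95),
  };
}

console.log(summarizeTimer([55, 60, 48, 120, 52, 58, 61, 49, 300, 57]));
```

The two slow requests in the sample barely move the median but dominate the upper percentiles, which is why percentiles are usually more telling than the mean.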

Gauge

Gauges represent a current value that can go up or down. For example, a gauge can represent the number of threads in an application or the number of jobs in a queue.
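On the wire, a gauge uses the type code "g", for example "queue.jobs:42|g" (the bucket name here is invented). StatsD treats a plain value as "set the gauge to this" and a value with a leading "+" or "-" as "adjust the gauge by this". The sketch below models only that rule; applyGauge is a hypothetical helper, not part of StatsD.

```javascript
// Apply the value part of a gauge datagram to the current gauge value.
// An unsigned value sets the gauge; a leading "+" or "-" adjusts it.
function applyGauge(current, payload) {
  const amount = Number(payload);
  if (payload === "" || Number.isNaN(amount)) {
    throw new Error(`invalid gauge value: ${payload}`);
  }
  const isDelta = payload.startsWith("+") || payload.startsWith("-");
  return isDelta ? current + amount : amount;
}

let queueJobs = 0;
for (const payload of ["42", "+5", "-7"]) {
  queueJobs = applyGauge(queueJobs, payload);
}
console.log(queueJobs); // 40
```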

In the Graphite web application, you can show both the counter and the timer values in one graph.

Integration with Node.js

We've just seen how to send metrics from the command line. In practice, metrics are generated by applications, such as Node.js services or Java-based servers, so the command-line examples don't apply directly.

Now let's see how an application written in Node.js sends metrics. Consider an Express server running on port 3000, as shown below.

const express = require("express");
const app = express();
 
app.get("/", (req, res) => {
    res.send("Response from a simple GET API");
});
 
app.listen(3000, () => {
    console.log("Node server started on port 3000");
});

First, you need to install node-statsd using npm.

npm i node-statsd --save

Then create an instance of the StatsD client as follows:

const StatsD = require("node-statsd"), client = new StatsD();

The StatsD constructor takes some optional arguments, such as the host and port of the machine running the StatsD server. The complete documentation is at https://github.com/sivy/node-statsd.

In my case, StatsD was running with the default options: host localhost and port 8125.

After you instantiate the client, you can call various methods to send metrics from your application. For example, you can track the number and timing of API calls as follows:

app.get("/", (req, res) => {
    res.send("Response from a simple GET API");
    client.increment("api_counter");
    client.timing("api_response_time", 110);
});
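The handler above reports a fixed 110 ms. If you want the real duration, one option is a small middleware that measures each request. The sketch below is an assumption-laden example rather than part of node-statsd: createMetricsMiddleware is a hypothetical name, and it only relies on the client having the increment and timing methods used above.

```javascript
// Build an Express-style middleware that reports real request durations.
// `client` is any object with increment(stat) and timing(stat, ms) methods,
// such as the node-statsd client created earlier. Metric names are examples.
function createMetricsMiddleware(client) {
  return function metricsMiddleware(req, res, next) {
    const startedAt = process.hrtime.bigint();
    // "finish" is emitted once the response has been sent.
    res.on("finish", () => {
      const elapsedMs = Number(process.hrtime.bigint() - startedAt) / 1e6;
      client.increment("api_counter");
      client.timing("api_response_time", Math.round(elapsedMs));
    });
    next();
  };
}

// Usage with the app and client from this article:
// app.use(createMetricsMiddleware(client));
```

Registering it with app.use before the routes means every route is measured without repeating the client calls in each handler.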

As soon as you open http://localhost:3000 in your browser, the API is called and the StatsD client sends the metrics. You can then see the updated metrics in the Graphite web application.

See the documentation at https://github.com/sivy/node-statsd for all the methods available on the client instance.

Integration with Java

Integration with Java-based clients is very similar to Node.js. If you are using a build system such as Maven or Gradle (highly recommended), a utility JAR (https://mvnrepository.com/artifact/com.timgroup/java-statsd-client/3.1.0) is available to facilitate this integration. Add the following to your build configuration to include it automatically.

For Maven:

<dependency>
    <groupId>com.timgroup</groupId>
    <artifactId>java-statsd-client</artifactId>
    <version>3.1.0</version>
</dependency>

For Gradle:

implementation group: 'com.timgroup', name: 'java-statsd-client', version: '3.1.0'

Once the client library is imported, instantiate the StatsDClient interface with the implementation class NonBlockingStatsDClient, passing the desired prefix and the hostname and port on which the StatsD server is running.

As shown below, this interface provides simple methods such as time() and incrementCounter() to send metrics to the StatsD server. See https://github.com/tim-group/java-statsd-client for complete documentation.

package example.statsd;

import com.timgroup.statsd.NonBlockingStatsDClient;
import com.timgroup.statsd.StatsDClient;

public class App {

   private static final StatsDClient statsd = new NonBlockingStatsDClient("java.statsd.example.prefix", "localhost", 8125);

   public static void main(String[] args) {
       statsd.incrementCounter("java_main_method_counter");
       statsd.time("java_method_time", 125L);

       statsd.stop();
   }
}

StatsD horizontal scaling

As your infrastructure grows, a single StatsD server eventually cannot handle all the load, and you need to scale horizontally. Horizontal scaling with StatsD is not simple round-robin load balancing, because StatsD also performs aggregation. If a metric with the same key is spread across multiple nodes, no single StatsD instance can aggregate that metric accurately.

Therefore, the authors of StatsD released a StatsD cluster proxy that uses consistent hashing to ensure that the same metric is always sent to the same instance.
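To show why consistent hashing solves this, here is a generic sketch of a hash ring. It is not the cluster proxy's actual implementation: the FNV-1a hash, the number of virtual points per node, and the function names are all choices made for this example.

```javascript
// 32-bit FNV-1a hash over UTF-16 code units; chosen only because it
// needs no dependencies.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Place several virtual points per node on a ring, sorted by hash.
function buildRing(nodes, pointsPerNode = 100) {
  const ring = [];
  for (const node of nodes) {
    for (let i = 0; i < pointsPerNode; i++) {
      ring.push({ hash: fnv1a(`${node}#${i}`), node });
    }
  }
  return ring.sort((a, b) => a.hash - b.hash);
}

// A key belongs to the first ring point at or after its hash, wrapping around.
function nodeFor(ring, key) {
  const hash = fnv1a(key);
  const point = ring.find((p) => p.hash >= hash) || ring[0];
  return point.node;
}

const ring = buildRing(["host1:8125", "host2:8125", "host3:8125"]);
console.log(nodeFor(ring, "deploys.test.myservice"));
```

Because a key's node depends only on the key and the ring, every datagram for the same bucket lands on the same StatsD instance. If a node is removed, only the keys that lived on that node move; all other keys keep their instance, so their aggregation is undisturbed.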

Below is a very simple configuration of the StatsD cluster proxy.

{
    nodes: [
        {host: 'host1', port: 8125, adminport: 8128},
        {host: 'host2', port: 8125, adminport: 8130},
        {host: 'host3', port: 8125, adminport: 8132}
    ],
    server: './servers/udp',
    host: '0.0.0.0',
    port: 8125,
    mgmt_port: 8126,
    forkCount: 4,
    checkInterval: 1000,
    cacheSize: 10000
}

Once the config file is set up, just run:

node proxy.js proxyConfig.js

proxy.js is located at the root of the StatsD installation directory.

Some of the configuration keys are worth explaining:

- **checkInterval**: Determines the health check interval. If a node is offline, the cluster proxy removes it from the configuration.
- **server**: The server implementation to load (here './servers/udp'). The metrics it receives are forwarded to the nodes listed under "nodes".

Summary

StatsD and Graphite are well suited to monitoring infrastructure. All of the code and configuration above is available in the GitHub repository.

The main advantages are:

- **Low memory footprint**: StatsD is a very simple Node.js-based server with a very low memory footprint, so it is easy to add this setup to your infrastructure.

- **Efficient network usage**: StatsD can operate over UDP, a connectionless protocol, so large amounts of data can be sent in a very short time.

If you would like to try this out, book a demo with MetricFire in Japanese (https://www.metricfire.com/demo/?utm_source=blog&utm_medium=Qiita&utm_campaign=Japan&utm_content=Monitoring%20your%20infrastructure%20with%20StatsD%20and%20Graphite). You can also discuss which monitoring solution best fits your needs.

See you in another article!
