[PYTHON] [blackbird-elasticache] Monitoring AWS ElastiCache (redis)

This plugin uses the CloudWatch API to collect AWS ElastiCache metrics. So far only redis is supported, because the project I work on uses only redis (I plan to add memcached in the future). This post introduces blackbird-elasticache.

What metrics does it get?

First, let's start with the list of metrics that can be obtained. Statistics is, as usual, the aggregation applied: Average, Sum, Maximum, or Minimum. ElastiCache's CloudWatch metrics fall into two groups: metrics of the middleware itself (in this case, redis or memcached) and metrics of the host on which the middleware is installed.

Host Side Metrics (called Host-level metrics by Amazon)

| Metric Name | Statistics | Detail |
|---|---|---|
| CPUUtilization | Average | CPU usage. A KVS workload does not seem to push it very high. |
| SwapUsage | Average | Amount of swap in use. |
| FreeableMemory | Average | Available memory on the host side. |
| NetworkBytesIn | Average | Inbound traffic to this host. |
| NetworkBytesOut | Average | Outbound traffic from this host. |
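As a rough illustration of how such a value can be read from CloudWatch, here is a minimal sketch using boto3. It is not the plugin's actual implementation. The helper names (`build_request`, `fetch_latest`), the 5-minute window, and the 60-second period are assumptions chosen for the example; the `AWS/ElastiCache` namespace and the `CacheClusterId` dimension are the ones CloudWatch uses for ElastiCache.

```python
from datetime import datetime, timedelta, timezone

# Host-level metrics and the statistic listed for each in the table above.
HOST_METRICS = {
    "CPUUtilization": ["Average"],
    "SwapUsage": ["Average"],
    "FreeableMemory": ["Average"],
    "NetworkBytesIn": ["Average"],
    "NetworkBytesOut": ["Average"],
}


def build_request(cluster_id, metric_name, statistics, end_time,
                  window_seconds=300, period=60):
    """Build keyword arguments for CloudWatch get_metric_statistics.

    window_seconds and period are example values, not plugin defaults.
    """
    return {
        "Namespace": "AWS/ElastiCache",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "CacheClusterId", "Value": cluster_id}],
        "StartTime": end_time - timedelta(seconds=window_seconds),
        "EndTime": end_time,
        "Period": period,
        "Statistics": list(statistics),
    }


def fetch_latest(cluster_id, metric_name, statistics, region_name):
    """Return the newest datapoint dict, or None if nothing was returned."""
    import boto3  # imported here so build_request works without boto3

    client = boto3.client("cloudwatch", region_name=region_name)
    request = build_request(
        cluster_id, metric_name, statistics, datetime.now(timezone.utc)
    )
    datapoints = client.get_metric_statistics(**request)["Datapoints"]
    if not datapoints:
        return None
    return max(datapoints, key=lambda d: d["Timestamp"])
```

With valid AWS credentials, `fetch_latest("my-redis-001", "SwapUsage", HOST_METRICS["SwapUsage"], "ap-northeast-1")` would return a dict with `Timestamp`, `Average`, and `Unit` keys; the cluster ID and region here are placeholders.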

About SwapUsage and FreeableMemory

Although ElastiCache is described as fully managed, my impression is that it runs plain redis as-is.

FreeableMemory and SwapUsage are the items I want to watch most carefully. If you take the words "fully managed" at face value, you may be consuming a lot of memory in ways other than just adding keys aggressively. Since it is a KVS, I think it is fine to use memory right up to the limit, but it becomes very dangerous once the host starts swapping.

This is from my own experience: we had a problem where redis responded slowly only during a specific time window. It turned out that we were writing too much data, the host swapped when bgsave ran, and responses were extremely slow only while it was swapping. Therefore, I think it is important to collect SwapUsage on a regular basis.
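To make that kind of episode easier to spot, the sketch below scans SwapUsage datapoints and reports the time ranges in which swap stayed above a threshold. The function name and the threshold are hypothetical; the datapoint shape (`Timestamp` and `Average` keys) follows what CloudWatch returns for the Average statistic.

```python
def swapping_periods(datapoints, threshold_bytes):
    """Return (start, end) timestamp pairs where SwapUsage exceeded the threshold.

    datapoints: iterable of dicts with "Timestamp" and "Average" keys.
    threshold_bytes: an arbitrary example limit, to be tuned per environment.
    """
    periods = []
    start = previous = None
    for point in sorted(datapoints, key=lambda d: d["Timestamp"]):
        if point["Average"] > threshold_bytes:
            if start is None:
                start = point["Timestamp"]  # a new swapping period begins
            previous = point["Timestamp"]
        elif start is not None:
            periods.append((start, previous))  # the period just ended
            start = None
    if start is not None:
        periods.append((start, previous))  # still swapping at the last point
    return periods
```

Comparing the reported ranges with the bgsave schedule would show whether slow responses line up with swapping, as they did in the case described above.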

Redis Metrics

| Metric Name | Statistics | Detail |
|---|---|---|
| CurrConnections | Average, Maximum | Current number of connections. Be careful not to hit max_connections! |
| Evictions | Average, Maximum | Number of values evicted before their expiry because of the maxmemory limit |
| Reclaimed | Average, Maximum | Total of the values removed on expiry and the values deleted after reaching the memory limit |
| NewConnections | Average, Maximum | Number of connections accepted within the collection interval |
| BytesUsedForCache | Maximum | Amount of memory allocated by redis |
| CacheHits | Average, Maximum | Number of cache hits |
| CacheMisses | Average, Maximum | Number of cache misses |
| ReplicationLag | Average, Maximum | Replication delay of a read replica in seconds (read replicas only) |
| GetTypeCmds | Maximum | Total number of get-type commands |
| SetTypeCmds | Maximum | Total number of set-type commands |
| KeyBasedCmds | Maximum | - |
| StringBasedCmds | Maximum | - |
| HashBasedCmds | Maximum | - |
| ListBasedCmds | Maximum | - |
| SetBasedCmds | Maximum | - |
| SortedSetBasedCmds | Maximum | - |
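The table above can also be written down as data, which makes it easy to see how many values are collected per polling cycle. This is only a sketch: `REDIS_METRICS` and `metric_requests` are names made up for the example, and the metric names use CloudWatch's spelling (`BytesUsedForCache`, `ReplicationLag`).

```python
# Redis-side metrics and the statistics collected for each, per the table above.
REDIS_METRICS = {
    "CurrConnections": ("Average", "Maximum"),
    "Evictions": ("Average", "Maximum"),
    "Reclaimed": ("Average", "Maximum"),
    "NewConnections": ("Average", "Maximum"),
    "BytesUsedForCache": ("Maximum",),
    "CacheHits": ("Average", "Maximum"),
    "CacheMisses": ("Average", "Maximum"),
    "ReplicationLag": ("Average", "Maximum"),  # read replicas only
    "GetTypeCmds": ("Maximum",),
    "SetTypeCmds": ("Maximum",),
    "KeyBasedCmds": ("Maximum",),
    "StringBasedCmds": ("Maximum",),
    "HashBasedCmds": ("Maximum",),
    "ListBasedCmds": ("Maximum",),
    "SetBasedCmds": ("Maximum",),
    "SortedSetBasedCmds": ("Maximum",),
}


def metric_requests(metrics):
    """Yield one (metric_name, statistic) pair per value to collect."""
    for name, statistics in metrics.items():
        for statistic in statistics:
            yield name, statistic


if __name__ == "__main__":
    for name, statistic in metric_requests(REDIS_METRICS):
        print(name, statistic)
```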

About Evictions and Reclaimed

Evictions is the number of keys deleted before they expire because the maxmemory limit was reached. Reclaimed, on the other hand, is the total number of values deleted. So, when Reclaimed - Evictions is large, either there is not enough memory on the host side or there are a lot of useless objects.
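The comparison described above can be computed from the two metric series. The sketch below reads it as Reclaimed minus Evictions, which is an assumption about the intended formula. It pairs datapoints by timestamp and uses the Maximum statistic; the function name is hypothetical.

```python
def reclaimed_minus_evictions(reclaimed, evictions):
    """Return {timestamp: reclaimed - evictions} for timestamps present in both.

    Both arguments are lists of dicts with "Timestamp" and "Maximum" keys,
    the shape CloudWatch returns for the Maximum statistic.
    """
    evicted = {d["Timestamp"]: d["Maximum"] for d in evictions}
    return {
        d["Timestamp"]: d["Maximum"] - evicted[d["Timestamp"]]
        for d in reclaimed
        if d["Timestamp"] in evicted  # skip points missing from one series
    }
```

What counts as "large" depends on the workload, so any alert threshold on this difference would have to be chosen per environment.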

Zabbix Template

Items

The items are just the values that can be obtained with the CloudWatch API, as listed above. (Sorry for cutting corners here.)

Graphs

The template contains the following graphs. Screenshots are included only for the graphs that look interesting; plain line graphs and graphs with only one element are omitted.

CPU Utilization

CPU usage on the host side.

Memory Usage

A stacked graph of FreeableMemory on the host side and used memory on the redis side.

[Screenshot: スクリーンショット_2014-12-22_2_42_15.png]

Network Traffic

Inbound and outbound network traffic.

[Screenshot: スクリーンショット_2014-12-22_2_42_29.png]

Cache Hits/Miss

A stacked graph of cache hits and misses.

Current Items

The current number of items (key-value pairs).

Evictions

The number of items that have been evicted.

Reclaimed

The number of items that are reclaimed.

New Connections

The number of Connections established per unit time.

CMDs

A stacked graph of the command counts by type.

[Screenshot: スクリーンショット_2014-12-22_2_42_49.png]

Summary

This plugin only collects CloudWatch metrics, but since ElastiCache is plain redis and plain memcached, it may be better to use blackbird-redis and [blackbird-memcached](http://qiita.com/makocchi/items/d178038588465ec8ba07) to look at the various values directly. If anything, using both kinds of plugin together probably gives the best of both worlds (that is what I want to do).
