We rely heavily on monit for process monitoring and automatic recovery on our servers, and we use Slack as our team collaboration tool. monit's default notification channel is email, which can delay our response to an incident, so I decided to post alerts to a Slack channel through Slack's Incoming Webhook. I looked into monit's specifications to implement this, so I will share the Python script and the monit configs as well. (The monit configs are nearly identical to one another, so in hindsight I probably did not need to write them all out.)
I wrote the script that monit calls for Python 3.5.6. Using Python 3 on CentOS 6.x requires building a runtime environment first, so prepare that before anything else. Adjust the directory names as needed and place things wherever you like.
[root@monit ~]# yum install bzip2-devel # I don't remember exactly, but I believe this was needed. You may not need it.
[root@monit ~]# curl -L https://raw.githubusercontent.com/pyenv/pyenv-installer/master/bin/pyenv-installer | bash
[root@monit ~]# pip install --upgrade pip
[root@monit ~]# echo 'export PATH="$HOME/.pyenv/bin:$PATH"' >> ~/.bashrc
[root@monit ~]# echo 'eval "$(pyenv init -)"' >> ~/.bashrc
[root@monit ~]# echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc
[root@monit ~]# source ~/.bashrc
[root@monit ~]# pyenv -v
pyenv 1.2.7
[root@monit ~]# pyenv install --list
[root@monit ~]# pyenv install 3.5.6
[root@monit ~]# mkdir /usr/local/scripts
[root@monit ~]# cd /usr/local/scripts
[root@monit scripts]# mkdir monitSlack
[root@monit scripts]# cd monitSlack/
[root@monit monitSlack]# python --version
Python 2.6.6
[root@monit monitSlack]# pyenv local 3.5.6
[root@monit monitSlack]# python --version
Python 3.5.6
To keep the script clean, it reads its settings from an external .env file and sets them as environment variables, so install the package needed for that.
[root@monit monitSlack]# pip install python-dotenv
The script writes logs, so create the log directory first. Also create the .env file now; it is used later.
[root@monit monitSlack]# mkdir log
[root@monit monitSlack]# vim .env
[root@monit monitSlack]# cat .env
export SLACK_URL="https://hooks.slack.com/services/xxxxx/xxxxx/xxxxxxxxxxxxxxxxx" # Rewrite to your own URL as appropriate
export SLACK_CHANNEL="#systemalert" # Rewrite to your own channel as appropriate
export SLACK_USER="monit" # Rewrite as appropriate (it's cute if you name the user monit and give it the monit dog icon)
export SLACK_ICON=":icon:"
Finally, the script itself. I wrote the Python script with reference to articles on the web.
[root@monit monitSlack]# vim monitNotifyToSlack.py
The contents are as follows.
#!/root/.pyenv/shims/python
import urllib.request
import json
import os
import datetime
import logging
from dotenv import load_dotenv

# load .env file
load_dotenv()

# Logging settings
formatter = '%(levelname)s : %(asctime)s : %(message)s'
logdir = os.path.dirname(os.path.abspath(__file__))
logfile = logdir + "/log/logger.log"
logging.basicConfig(filename=logfile, level=logging.DEBUG, format=formatter)

# logging attributes
logging.debug("==================== START ====================")
logging.debug("Slack notification script started.")
logging.debug("MOINT_HOST : " + os.getenv('MONIT_HOST'))
logging.debug("MOINT_SERVICE : " + os.getenv('MONIT_SERVICE'))
logging.debug("MOINT_DESCRIPTION : " + os.getenv('MONIT_DESCRIPTION'))

if __name__ == "__main__":
    # set script attributes
    surl = os.getenv('SLACK_URL')
    schannel = os.getenv('SLACK_CHANNEL')
    suser = os.getenv('SLACK_USER')
    sicon = os.getenv('SLACK_ICON')
    mhost = os.getenv('MONIT_HOST')
    mservice = os.getenv('MONIT_SERVICE')
    mdesc = os.getenv('MONIT_DESCRIPTION')
    curDate = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    colorAlert = "#ff0000"
    colorRecover = "#008000"

    # Alert (red) if the description contains a typical failure phrase, otherwise recovery (green).
    if 'not running' in mdesc or 'failed' in mdesc or 'resource limit' in mdesc:
        titleText = "[" + curDate + "] " + mhost + " - " + mservice
        color = colorAlert
        descMessage = mservice + " " + mdesc
    else:
        titleText = "[" + curDate + "] " + mhost + " - " + mservice
        color = colorRecover
        descMessage = mservice + " " + mdesc

    # Set context to Slack webhook notification format.
    data = {
        "channel" : schannel,
        "username" : suser,
        "icon_emoji" : sicon,
    }
    data['text'] = titleText
    data['attachments'] = []
    data['attachments'].append({})
    data['attachments'][0]['color'] = color
    data['attachments'][0]['fields'] = []
    data['attachments'][0]['fields'].append({})
    data['attachments'][0]['fields'][0]['title'] = "Monit Alert on " + mhost
    data['attachments'][0]['fields'][0]['value'] = descMessage
    logging.debug("WEBHOOK_DATA : " + str(data))

    headers = {
        'Content-Type': 'application/json'
    }

    # POST the JSON payload to the webhook URL.
    req = urllib.request.Request(surl, json.dumps(data).encode(), headers)
    with urllib.request.urlopen(req) as res:
        result = res.read()
    logging.debug("request sent to " + surl)
    logging.debug("result : " + str(result))
    logging.debug("==================== END ====================")
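The payload construction can also be pulled out into small functions, which makes it possible to check the data without sending anything to Slack. The following is only a minimal sketch, not the structure of the script above: build_payload and build_request are hypothetical helper names, and the values in the usage lines are placeholders. It uses icon_emoji, the key Slack's Incoming Webhook expects for an emoji icon.

```python
import json
import urllib.request


def build_payload(channel, user, icon, title, color, host, message):
    """Return a dict in the Slack Incoming Webhook attachment format."""
    return {
        "channel": channel,
        "username": user,
        "icon_emoji": icon,
        "text": title,
        "attachments": [
            {
                "color": color,
                "fields": [
                    {"title": "Monit Alert on " + host, "value": message}
                ],
            }
        ],
    }


def build_request(url, payload):
    """Wrap the payload in a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    # Supplying a request body makes urllib issue a POST.
    return urllib.request.Request(url, data=body, headers=headers)


# Placeholder values for illustration only.
payload = build_payload(
    "#systemalert", "monit", ":icon:",
    "[2020-03-10 10:54:27] monitslacktest.com - testservice",
    "#ff0000", "monitslacktest.com", "testservice monit test failed",
)
request = build_request(
    "https://hooks.slack.com/services/xxxxx/xxxxx/xxxxxxxxxxxxxxxxx", payload
)
print(request.get_method())
```

Passing the resulting request to urllib.request.urlopen would send it, exactly as the script does.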
The script logs each webhook request to Slack in log/logger.log. Comparing this log with the monit log helps you determine whether monit called the script but it did not run, and whether the POSTed data is correct. Don't forget to grant execute permission after creating the file.
[root@monit monitSlack]# chmod +x monitNotifyToSlack.py
[root@monit monitSlack]# ls -l monitNotifyToSlack.py
-rwxr-xr-x 1 root root 2553 Mar 9 16:20 monitNotifyToSlack.py
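The script takes the event details from the MONIT_HOST, MONIT_SERVICE, and MONIT_DESCRIPTION environment variables. os.getenv returns None for a variable that is not set, and concatenating None with a string raises TypeError, so the script fails if any of them is missing. A minimal sketch of reading them with a fallback follows; read_monit_env and the "unknown" default are assumptions for illustration and are not part of the original script.

```python
import os

MONIT_KEYS = ("MONIT_HOST", "MONIT_SERVICE", "MONIT_DESCRIPTION")


def read_monit_env(environ=os.environ, default="unknown"):
    """Return the monit variables as a dict, using a default for unset ones."""
    return {key: environ.get(key, default) for key in MONIT_KEYS}


# Simulate a manual run where only the description is set.
values = read_monit_env({"MONIT_DESCRIPTION": "monit test failed"})
print(values["MONIT_HOST"] + " : " + values["MONIT_DESCRIPTION"])
```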
Let's run it manually once as a test. When monit calls the script, it runs with the environment variables that monit sets, so for this test I set those variables in my shell before running it.
[root@monit monitSlack]# export MONIT_HOST=monitslacktest.com
[root@monit monitSlack]# export MONIT_SERVICE=testservice
[root@monit monitSlack]# export MONIT_DESCRIPTION="monit test failed"
[root@monit monitSlack]# ./monitNotifyToSlack.py
If successful, you should get a red notification in Slack. The log should also contain the following output.
[root@monit monitSlack]# less log/logger.log
DEBUG : 2020-03-10 10:54:27,635 : ==================== START ====================
DEBUG : 2020-03-10 10:54:27,635 : Slack notification script started.
DEBUG : 2020-03-10 10:54:27,635 : MOINT_HOST : monitslacktest.com
DEBUG : 2020-03-10 10:54:27,635 : MOINT_SERVICE : testservice
DEBUG : 2020-03-10 10:54:27,635 : MOINT_DESCRIPTION : monit test failed
DEBUG : 2020-03-10 10:54:27,635 : WEBHOOK_DATA : {'username': 'monit', 'attachments': [{'color': '#ff0000', 'fields': [{'title': 'Monit Alert on monitslacktest.com', 'value': 'testservice monit test failed'}]}], 'channel': '#systemalert', 'iconi_emoji': ':icon:', 'text': '[2020-03-10 10:54:27] monitslacktest.com - testservice'}
DEBUG : 2020-03-10 10:54:28,571 : request sent to https://hooks.slack.com/services/xxxxx/xxxxxx/xxxxxxxxxxxxxxxxxxxxx
DEBUG : 2020-03-10 10:54:28,571 : result : b'ok'
DEBUG : 2020-03-10 10:54:28,571 : ==================== END ====================
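Because every line follows the "LEVEL : timestamp : message" format configured in the script, the log is easy to inspect programmatically, for example to confirm that Slack answered ok. A minimal sketch; parse_log_line is a hypothetical helper, and the sample lines are copied from the output above.

```python
def parse_log_line(line):
    """Split 'LEVEL : timestamp : message' into its three parts."""
    # maxsplit=2 keeps any ' : ' inside the message itself intact.
    level, timestamp, message = line.rstrip("\n").split(" : ", 2)
    return level, timestamp, message


sample = [
    "DEBUG : 2020-03-10 10:54:27,635 : MOINT_HOST : monitslacktest.com",
    "DEBUG : 2020-03-10 10:54:28,571 : result : b'ok'",
]
parsed = [parse_log_line(line) for line in sample]

# Keep only the lines that record Slack's response.
results = [message for _, _, message in parsed if message.startswith("result : ")]
print(results)
```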
Let's also check the recovery case. Check the log as needed.
[root@monit monitSlack]# export MONIT_DESCRIPTION="monit test process is running with pid 27554"
[root@monit monitSlack]# ./monitNotifyToSlack.py
If all goes well, the notification is green this time. The script checks the value of the MONIT_DESCRIPTION environment variable: if it contains failed, not running, or resource limit, phrases that commonly appear when monit raises an alert, the color is red; otherwise it is green.
Next, the monitoring config for monit. monit handles a restart well on its own when you use its restart action, but calling a script requires exec. If you want to run the script and restart the process together, or unmonitor the service depending on how it behaves after an exec-driven restart, you have to write each instruction out explicitly, which makes the config somewhat complicated.
I will omit the installation; install monit from epel-release as appropriate. The rest of this article assumes that monit is installed under /etc/ with the following configuration.
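The color decision can be written as a small function, which makes it easy to try against different descriptions. A minimal sketch using the same phrases and colors as the script; pick_color is a hypothetical name for illustration.

```python
ALERT_PHRASES = ("not running", "failed", "resource limit")
COLOR_ALERT = "#ff0000"
COLOR_RECOVER = "#008000"


def pick_color(description):
    """Return red if the description contains an alert phrase, else green."""
    if any(phrase in description for phrase in ALERT_PHRASES):
        return COLOR_ALERT
    return COLOR_RECOVER


print(pick_color("monit test failed"))
print(pick_color("monit test process is running with pid 27554"))
```

Since this is a plain substring match, any description that does not contain one of the listed phrases is treated as a recovery, so the phrase list may need extending for other kinds of checks.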
I will omit the details, but please take care of the following
monit config
Let's write the configs. They are long and redundant, so pick out only the parts you need. Beforehand, I renamed the logging file, because the include pattern had been changed from * to *.conf and logging was no longer being read.
[root@monit monit.d]# cd /etc/monit.d
[root@monit monit.d]# mv logging logging.conf
Nginx
The monitoring requirements are as follows. I only comment on this first config; the rest are almost the same, so I omit the commentary.
[root@monit monit.d]# vim nginx.conf
[root@monit monit.d]# cat nginx.conf
check process nginx with pidfile /var/run/nginx.pid
start program = "/etc/init.d/nginx start" with timeout 30 seconds
stop program = "/etc/init.d/nginx stop"
if failed port 80 protocol http
and request "/"
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/nginx restart'"
repeat every 10 cycles
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/nginx restart'"
repeat every 1 cycles
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist for 4 times within 5 cycles then unmonitor
[root@monit monit.d]# vim host.conf
[root@monit monit.d]# cat host.conf
check system $HOST
if loadavg (1min) > 8
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if loadavg (5min) > 4
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if cpu usage > 80% for 3 cycles
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if memory usage > 75%
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if swap usage > 50%
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
crond
[root@monit monit.d]# vim crond.conf
[root@monit monit.d]# cat crond.conf
check process crond with pidfile /var/run/crond.pid
start program = "/etc/init.d/crond start" with timeout 30 seconds
stop program = "/etc/init.d/crond stop"
if does not exist
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/crond restart'"
repeat every 1 cycles
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist for 4 times within 5 cycles then unmonitor
[root@monit monit.d]# cat disk.conf
check device xvda1 with path /
if SPACE usage > 80%
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
jetty
I think tomcat can be monitored in the same way.
[root@monit monit.d]# vim jetty.conf
[root@monit monit.d]# cat jetty.conf
start program = "/etc/init.d/jetty start" with timeout 30 seconds
stop program = "/etc/init.d/jetty stop"
if failed port 8080 protocol http with timeout 15 seconds
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/jetty restart'"
repeat every 1 cycles
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/jetty restart'"
repeat every 1 cycles
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist for 4 times within 5 cycles then unmonitor
MySQL
[root@monit monit.d]# vim mysqld.conf
[root@monit monit.d]# cat mysqld.conf
check process mysqld with pidfile /var/run/mysqld/mysqld.pid
start program = "/etc/init.d/mysqld start" with timeout 30 seconds
stop program = "/etc/init.d/mysqld stop"
if failed port 3306 protocol mysql with timeout 10 seconds
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/mysqld restart'"
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/mysqld restart'"
repeat every 1 cycles
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist for 4 times within 5 cycles then unmonitor
[root@monit monit.d]# vim mysql-remotedb.conf
[root@monit monit.d]# cat mysql-remotedb.conf
check host mysql-remotedb with address 172.xxx.xxx.xxx #Please rewrite IP as appropriate
if failed port 3306 protocol mysql with timeout 10 seconds
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
ntpd
I think chrony can be monitored in the same way.
[root@monit monit.d]# vim ntpd.conf
[root@monit monit.d]# cat ntpd.conf
check process ntpd with pidfile /var/run/ntpd.pid
start program = "/etc/init.d/ntpd start" with timeout 30 seconds
stop program = "/etc/init.d/ntpd stop"
if does not exist
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/ntpd restart'"
repeat every 1 cycles
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist for 4 times within 5 cycles then unmonitor
sendmail
[root@monit monit.d]# vim sendmail.conf
[root@monit monit.d]# cat sendmail.conf
check process sendmail with pidfile /var/run/sendmail.pid
start program = "/etc/init.d/sendmail start" with timeout 30 seconds
stop program = "/etc/init.d/sendmail stop"
if does not exist
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/sendmail restart'"
repeat every 1 cycles
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist for 4 times within 5 cycles then unmonitor
sshd
[root@monit monit.d]# vim sshd.conf
[root@monit monit.d]# cat sshd.conf
check process sshd with pidfile /var/run/sshd.pid
start program = "/etc/init.d/sshd start" with timeout 30 seconds
stop program = "/etc/init.d/sshd stop"
if does not exist
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/sshd restart'"
repeat every 1 cycles
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist for 4 times within 5 cycles then unmonitor
vsftpd
By default, vsftpd does not generate a pid file. There seems to be a way to make it do so, but I skip that and just monitor the service.
[root@monit monit.d]# vim vsftpd.conf
[root@monit monit.d]# cat vsftpd.conf
check host vsftpd with address 127.0.0.1
start program = "/etc/init.d/vsftpd start" with timeout 30 seconds
stop program = "/etc/init.d/vsftpd stop"
if failed port 21 protocol ftp with timeout 15 seconds
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/vsftpd restart'"
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
zabbix-agent
[root@monit monit.d]# cat zabbix-agent.conf
check process zabbix-agent with pidfile /var/run/zabbix/zabbix_agentd.pid
start program = "/etc/init.d/zabbix-agent start" with timeout 30 seconds
stop program = "/etc/init.d/zabbix-agent stop"
if does not exist
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/zabbix-agent restart'"
repeat every 1 cycles
else if succeeded
then exec "/bin/bash -c '/usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'"
if does not exist for 4 times within 5 cycles then unmonitor
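Most of the process checks above differ only in the service name and the pid file, so the files could also be generated rather than written by hand. A minimal sketch, assuming the same script path, log path, and init scripts as above; render_process_check is a hypothetical helper, and this is not how the configs in this article were produced.

```python
NOTIFY = (
    "/usr/local/scripts/monitSlack/monitNotifyToSlack.py"
    " >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1"
)

# Shared shape of the "does not exist" process checks shown above.
TEMPLATE = """check process {name} with pidfile {pidfile}
start program = "/etc/init.d/{name} start" with timeout 30 seconds
stop program = "/etc/init.d/{name} stop"
if does not exist
then exec "/bin/bash -c '{notify} & /etc/init.d/{name} restart'"
repeat every 1 cycles
else if succeeded
then exec "/bin/bash -c '{notify}'"
if does not exist for 4 times within 5 cycles then unmonitor
"""


def render_process_check(name, pidfile):
    """Fill the shared stanza with one service's name and pid file."""
    return TEMPLATE.format(name=name, pidfile=pidfile, notify=NOTIFY)


print(render_process_check("sshd", "/var/run/sshd.pid"))
```

Checks with extra conditions, such as the port tests for Nginx or MySQL, would need their own templates.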
Let's reload monit and check that the settings were applied. It is fine if monit -Iv shows the config loaded exactly as written, as in the following output. After that, verify that monit restarts services automatically and that unmonitor works properly.
[root@monit monit.d]# service monit reload
Reloading monit: Reinitializing monit daemon
[root@monit monit.d]# monit -Iv
...
Process Name = mysqld
Pid file = /var/run/mysqld/mysqld.pid
Monitoring mode = active
On reboot = start
Start program = '/etc/init.d/mysqld start' timeout 30 s
Stop program = '/etc/init.d/mysqld stop' timeout 30 s
Existence = if does not exist for 4 times within 5 cycles then unmonitor
Existence = if does not exist then exec '/bin/bash -c /usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/mysqld restart' repeat every 1 cycle(s) else if succeeded then exec '/bin/bash -c /usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'
Port = if failed [localhost]:3306 type TCP/IP protocol MYSQL with timeout 10 s then exec '/bin/bash -c /usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1 & /etc/init.d/mysqld restart' else if succeeded then exec '/bin/bash -c /usr/local/scripts/monitSlack/monitNotifyToSlack.py >> /usr/local/scripts/monitSlack/log/monitlog.log 2>&1'
...
That's it.