Monitor Python web apps with Prometheus


In this article, I'll show you how to monitor a simple [Flask](http://flask.pocoo.org/) application running on uWSGI + nginx with Prometheus, with a complete example that makes the application fully functional for monitoring.

A little history

Prometheus was originally a monitoring tool derived from Google's Borgmon.

In its native environment, Borgmon relies on ubiquitous and straightforward service discovery: the services it monitors are managed by Borg and are therefore easy to find. A target might be all the jobs in a cluster for a particular user or, for more complex deployments, all the subtasks that together make up a job.

Each of these is usually a single multithreaded server written in C++, Java, Go, or (less commonly) Python, and each can be a single target from which Borgmon retrieves data via the `/varz` endpoint, analogous to Prometheus' `/metrics`.

Prometheus inherits many of Borgmon's assumptions about its environment. In particular, the client libraries assume that metrics come from various libraries and subsystems across multiple threads running in a shared address space, and on the server side, Prometheus assumes that a target is a single (probably multithreaded) program.

Don't do it this way

These assumptions break in many non-Google deployments, especially in the Python world. Here it's common to run applications (built with, for example, Django or Flask) under a WSGI application server that spreads requests across multiple workers, where each worker is a process rather than a thread.

In a naive deployment of the Prometheus Python client with a Flask app running under uWSGI, each request from the Prometheus server to `/metrics` hits a different worker process, and each worker exports its own counters, histograms, and so on. The resulting monitoring data is garbage.

In fact, each scrape of a particular counter returns the value for one worker rather than for the whole job: the value jumps around between scrapes and tells you nothing useful about the application as a whole.
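This failure mode is easy to simulate with nothing but the standard library. In this toy sketch (worker counts and request numbers are purely illustrative), four "workers" each keep a private counter, and each "scrape" reads whichever worker happens to answer:

```python
import random

# Four "worker processes", each with its own private request counter.
workers = [0, 0, 0, 0]

# Serve 100 requests, each landing on a random worker.
for _ in range(100):
    workers[random.randrange(4)] += 1

# Each "scrape" of /metrics hits one arbitrary worker, so successive
# scrapes see unrelated per-worker values instead of the job-wide total.
scrapes = [workers[random.randrange(4)] for _ in range(5)]
print("per-scrape values:", scrapes)  # jumps around between scrapes
print("true total:", sum(workers))   # always 100
```

No single scrape ever sees the true total, which is exactly the behavior described above.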

Solution

Amit Saha discusses the same problem and various solutions in a detailed article (https://echorand.me/your-options-for-monitoring-multi-process-python-applications-with-prometheus.html). As described there, the Prometheus Python client includes a multiprocess mode intended to handle this situation, with gunicorn as the motivating application-server example.

This works by sharing a directory of `mmap()`'d dictionaries across all the processes in your application. At scrape time, each process then does the math to return a shared, application-wide view of the metrics to Prometheus.
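A toy version of this scheme, using plain text files instead of the client's actual `mmap()`'d file format (the file names and layout here are illustrative, not what the client really does):

```python
import os
import tempfile

# Shared directory that every worker process can write to; the real client
# uses mmap()'d files here, but plain text files are enough for a sketch.
shared_dir = tempfile.mkdtemp(prefix="prometheus_multiproc_")

def incr(pid, value=1):
    """A worker bumps its own per-pid counter file."""
    path = os.path.join(shared_dir, f"counter_{pid}.txt")
    current = 0
    if os.path.exists(path):
        with open(path) as f:
            current = int(f.read())
    with open(path, "w") as f:
        f.write(str(current + value))

def scrape():
    """At scrape time, sum every worker's file for a job-wide view."""
    total = 0
    for name in os.listdir(shared_dir):
        with open(os.path.join(shared_dir, name)) as f:
            total += int(f.read())
    return total

# Simulate three worker processes handling different numbers of requests.
for pid, hits in [(101, 5), (102, 7), (103, 3)]:
    for _ in range(hits):
        incr(pid)

print(scrape())  # 15, whichever process answers the scrape
```

Because every process reads the whole shared directory at scrape time, any worker can answer `/metrics` with the same application-wide totals.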

This has some "heading" disadvantages mentioned in the documentation. Examples include the lack of free Python metrics per process, the lack of full support for certain metric types, and the slightly more complex gauge types.

Setting it up end-to-end is also a little fiddly. Here's what's required, and how we achieved each part in our environment; hopefully this complete example will be useful to anyone doing similar work in the future.

  1. The shared directory must be passed to the processes as the `prometheus_multiproc_dir` environment variable. Pass it using uWSGI's env option; see uwsgi.ini.

  2. The client's shared directory must be cleared across application restarts. This is a bit tricky to get right. We use one of uWSGI's hardcoded hooks, exec-asap (https://uwsgi-docs.readthedocs.io/en/latest/Hooks.html#exec-run-shell-commands), to run a shell script immediately after reading the configuration file, before doing anything else; see uwsgi.ini. The script (https://github.com/hostedgraphite/pandoras_flask/blob/master/bin/clear_prometheus_multiproc) deletes and recreates the Prometheus client's shared data directory. To ensure the right permissions, we run uwsgi as `root` under supervisord (http://supervisord.org/) and drop privileges within uwsgi (https://github.com/hostedgraphite/pandoras_flask/blob/master/conf/uwsgi.ini#L18).

  3. The application must set up the Python client's multiprocess mode. This mostly involves following the documentation (https://github.com/prometheus/client_python#multiprocess-mode-gunicorn); see metrics.py. Note that this also includes a neat piece of middleware that exports Prometheus metrics for response status and latency (https://github.com/hostedgraphite/pandoras_flask/blob/master/pandoras_flask/metrics.py#L17).

  4. uWSGI must set up the application environment so that the application is loaded after `fork()`. By default, uWSGI saves memory by loading the application and then forking; this does have the advantage of copy-on-write (https://en.wikipedia.org/wiki/Copy-on-write), which can save a lot of memory, but it seems to interfere with the client's multiprocess mode, probably because locks are taken before the `fork()`. uWSGI's lazy-apps option lets us load the application after forking, giving a cleaner environment.
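Pulling steps 1, 2, and 4 together, a minimal uwsgi.ini could look roughly like this. The paths, module name, and script location are illustrative stand-ins, not copied from the demo:

```ini
[uwsgi]
; Step 1: tell the Prometheus client where the shared metrics live
env = prometheus_multiproc_dir=/var/run/prometheus_multiproc

; Step 2: wipe and recreate the shared directory before anything else runs
exec-asap = /app/bin/clear_prometheus_multiproc

; Step 4: load the app after fork() so each worker gets clean client state
lazy-apps = true

; Started as root under supervisord, then drop privileges
uid = www-data
gid = www-data

module = pandoras_flask.app:app
processes = 4
```

See the demo's conf/uwsgi.ini for the real, working configuration.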

With all of that in place, the `/metrics` endpoint of a Flask app running under uWSGI works, and you can see it fully functional in the pandoras_flask demo.

Note that the demo exposes the metrics endpoint on a different port (https://github.com/hostedgraphite/pandoras_flask/blob/master/conf/nginx.conf#L28) from the application itself. This makes it easy to allow your monitoring access to it without users being able to reach it.

In your own deployment, you can also run uwsgi_exporter to get more statistics out of uWSGI itself.

Alternatives

Saha's blog post (https://echorand.me/your-options-for-monitoring-multi-process-python-applications-with-prometheus.html) walks through a set of alternatives, landing on pushing metrics via a local [statsd](https://github.com/etsy/statsd) as the recommended solution. That's not a route we particularly like.

Ultimately, running everything under container orchestration such as Kubernetes would provide the kind of native environment in which Prometheus shines, but that's a big step to take for an existing Python application stack just to get its other benefits.

Perhaps the most Promethean intermediate step is to register each subprocess individually as a scrape target. This is the approach taken by [django-prometheus](https://github.com/korfuri/django-prometheus/blob/master/documentation/exports.md#exporting-metrics-in-a-wsgi-application-with-multiple-processes-per-process), though the "port range" approach proposed there is a bit janky.

In our environment, we could implement this idea as follows (and we may yet):

  1. Run a web server in a thread within each process, listening on an ephemeral port and serving `/metrics` queries.
  2. Register each web server's address (e.g. hostname:32769) under a short-TTL etcd path, refreshing it periodically. We already use etcd for most of our service discovery needs.
  3. Use file-based service discovery in Prometheus to locate these targets and scrape them as individuals.
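Step 1 of that sketch can be prototyped with only the standard library. The metric payload below is a hard-coded stand-in for real client exposition output, and all names are illustrative:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = b"app_requests_total 42\n"  # stand-in for real client output
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

# Port 0 asks the OS for an ephemeral port; the resulting host:port pair
# is what you would register in etcd for Prometheus to discover.
server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

payload = urlopen(f"http://127.0.0.1:{port}/metrics").read().decode()
print(payload)
server.shutdown()
```

Each worker process would run one of these in a background thread, so every worker becomes its own scrape target.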

We don't think this approach is much more complicated than using the Python client's multiprocess mode, but it does come with complexities of its own.

Keep in mind that having one target per worker causes a time-series explosion. In this case, for example, a single default histogram metric tracking response times from the Python client across eight workers would generate about 140 individual time series, before multiplying by any other labels. That's not a volume Prometheus has trouble handling, but it adds up as you scale, so be aware of it.
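The back-of-the-envelope arithmetic behind that estimate, assuming the Python client's default histogram bucket boundaries:

```python
# The Python client's default histogram buckets (14 finite bounds + +Inf).
default_buckets = [.005, .01, .025, .05, .075, .1, .25, .5, .75,
                   1.0, 2.5, 5.0, 7.5, 10.0, float("inf")]

# One histogram = one series per bucket, plus the _sum and _count series.
series_per_histogram = len(default_buckets) + 2  # 17
workers = 8
print(series_per_histogram * workers)  # 136 -- "about 140" per metric
```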

Summary

For now, exporting metrics to Prometheus from a standard Python web app stack is somewhat involved whichever route you take. We hope this post will be useful to anyone who wants to get started with an existing nginx + uwsgi + Flask app.

As we run more services under container orchestration (which we are aiming to do), we expect it to become easier to integrate Prometheus monitoring with them.

If you're an established Prometheus user, we recommend taking a look at our [Hosted Prometheus](https://try.metricfire.com/japan/?utm_source=blog&utm_medium=Qiita&utm_campaign=Japan&utm_content=Pandora's%20Flask%3A%20Monitoring%20a%20Python%20web%20app%20with%20Prometheus) service. Feel free to [book a demo](https://calendly.com/metricfire-chatwithus/chat?utm_source=blog&utm_medium=Qiita&utm_campaign=Japan&utm_content=Pandora's%20Flask%3A%20Monitoring%20a%20Python%20web%20app%20with%20Prometheus) and talk to us.
