Microservices in Python (Overview)

1 Introduction

Recently, I had the opportunity to investigate developing microservices (REST APIs, in practice) in a lightweight language (LL). "Lightweight language" covers several languages, such as Ruby, Go, and Python, but many of my acquaintances use Python for machine learning and IoT, so I am recording here, as a memorandum, how to implement microservices in Python. This is my first time working with Python, so please read with that in mind.

1.1 Verification environment

The environment used in this verification is as follows. I did not try a Python virtual environment this time (because I was told that it does not work on Windows).

1.2 Library to use

These are the libraries used this time to create a microservice in Python. All of them are simple, easy to use, and can be installed with "pip install".

No.  Library   Overview                        Site
1    flask     Lightweight MVC framework       http://flask.pocoo.org/
2    peewee    O/R mapping                     http://docs.peewee-orm.com/en/latest/
3    psycopg2  Required for PostgreSQL access  https://pypi.python.org/pypi/psycopg2
4    cerberus  Input check library             http://docs.python-cerberus.org/en/stable/
5    requests  HTTP client                     http://docs.python-requests.org/en/master/

2 Environment construction

2.1 Python installation

I installed Python by following the installer. The installation destination is the default, C:\Python27.

2.2 Path

The following two locations were added to the PATH environment variable. (For reference, the path delimiter on Windows is ";".)

2.3 Setting environment variables to get through a proxy

If Internet access requires a proxy because of your organization's network configuration, set the proxy in environment variables.

Either user environment variables or system environment variables will do. If you need the setting only while running pip install, I recommend setting it temporarily with set.

set HTTPS_PROXY=userid:password@proxyhost:8080
set HTTP_PROXY=userid:password@proxyhost:8080

Match the proxy host, port number, user ID, and password to your environment. If the proxy does not require user authentication, the part up to and including @ is not needed.

2.4 Setting URLs that should bypass the proxy as an environment variable

There may be URLs that you do not want to go through the proxy, for example when accessing a server on a private network. In that case, use the NO_PROXY environment variable.

set NO_PROXY=127.0.0.1,localhost,apserver

When I tried to access a server I had set up myself using requests, for some reason I could not reach it through the proxy, so I set the NO_PROXY environment variable.

2.5 Install the required libraries with pip

Again, I did not try a Python virtual environment this time, so I installed the libraries directly with pip.

pip install peewee
pip install psycopg2
pip install requests
pip install cerberus
pip install flask

3 Concept application

Let's implement an application that demonstrates the minimum usage of the libraries introduced above (flask, cerberus, peewee).

This time, I do not split functions or files according to responsibility. For readability, there is one source file for the microservice web application and one for the client that calls it.

3.1 Concept App Overview

This is a microservice (REST API) application with a single API that simply adds records to the question table shown below.

create_table.sql


DROP TABLE IF EXISTS question;

CREATE TABLE question (
    question_code   VARCHAR(10)    NOT NULL,
    category        VARCHAR(10)    NOT NULL,
    message         VARCHAR(100)   NOT NULL,
    CONSTRAINT question_pk PRIMARY KEY(question_code)
);

The application's specifications and restrictions are as follows. The specifications may look like a mixture of purposes, but the following are the points of the implementation.

3.2 Demo microservices

3.2.1 Source code

demoapp.py


# -*- coding: utf-8 -*-
import os
from flask import Flask, abort, request, make_response, jsonify
import peewee
from playhouse.pool import PooledPostgresqlExtDatabase
import cerberus

# peewee
db = PooledPostgresqlExtDatabase(
    database = os.getenv("APP_DB_DATABASE", "demodb"),
    host = os.getenv("APP_DB_HOST", "localhost"),
    # values read from environment variables are strings, so convert the numeric ones
    port = int(os.getenv("APP_DB_PORT", 5432)),
    user = os.getenv("APP_DB_USER", "postgres"),
    password = os.getenv("APP_DB_PASSWORD", "postgres"),
    max_connections = int(os.getenv("APP_DB_CONNECTIONS", 4)),
    stale_timeout = int(os.getenv("APP_DB_STALE_TIMEOUT", 300)),
    register_hstore = False)

class BaseModel(peewee.Model):
    class Meta:
        database = db

# model
class Question(BaseModel):
    question_code = peewee.CharField(primary_key=True)
    category = peewee.CharField()
    message = peewee.CharField()

# validation schema for cerberus
question_schema = {
    'question_code' : {
        'type' : 'string',
        'required' : True,
        'empty' : False,
        'maxlength' : 10,
        'regex' : '^[0-9]+$'
    },
    'category' : {
        'type' : 'string',
        'required' : True,
        'empty' : False,
        'maxlength' : 10
    },
    'message' : {
        'type' : 'string',
        'required' : True,
        'empty' : False,
        'maxlength' : 100
    }
}

# flask
app = Flask(__name__)

# rest api
@app.route('/questions', methods=['POST'])
def register_question():
    #Check the input
    v = cerberus.Validator(question_schema)
    v.allow_unknown = True
    validate_pass = v.validate(request.json)
    if not validate_pass:
        abort(404)
    
    #Calling business logic
    result = register(request.json)
    #Return of processing result (JSON format)
    return make_response(jsonify(result))

# error handling
@app.errorhandler(404)
def not_found(error):
    return make_response(jsonify({'error' : 'Not Found'}), 404)

@app.errorhandler(500)
def server_error(error):
    return make_response(jsonify({'error' : 'ERROR'}), 500)

#Business logic
@db.atomic()
def register(input):
    # create instance
    question = Question()
    question.question_code = input.get("question_code")
    question.category = input.get("category")
    question.message = input.get("message")
    # insert record using peewee api
    question.save(force_insert=True)
    result = {
        'result' : True,
        'content' : question.__dict__['_data']
    }
    return result

# main
if __name__ == "__main__":
    # an environment variable is a string, so convert the port to int
    app.run(host=os.getenv("APP_ADDRESS", 'localhost'),
            port=int(os.getenv("APP_PORT", 3000)))

3.2.2 DB settings in peewee

3.2.2.1 Corresponding DB and connection pool

With peewee, you can access databases such as SQLite, MySQL, and PostgreSQL. See http://docs.peewee-orm.com/en/latest/peewee/database.html for more information.

Since microservices are web applications, a connection pool is practically essential, and peewee supports one. The class to use differs for each database; for PostgreSQL, use the PooledPostgresqlExtDatabase class.

The PooledPostgresqlExtDatabase class requires psycopg2; if it is not installed, an error occurs at runtime.

Although it has nothing to do with the peewee function, it is recommended that the DB connection information (host, port number, user ID, password, etc.) be set from outside the program (environment variables) because it depends on the environment.
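As a rough sketch of keeping the connection settings outside the program, the following collects them from environment variables in one place. The helper name load_db_settings and the host name dbserver are hypothetical; the variable names and defaults are the ones the demo app uses. Values read from the environment are always strings, so the numeric settings are converted with int().

```python
import os


def load_db_settings(env=None):
    """Hypothetical helper: build the DB settings from environment variables."""
    env = os.environ if env is None else env
    return {
        "database": env.get("APP_DB_DATABASE", "demodb"),
        "host": env.get("APP_DB_HOST", "localhost"),
        # environment values are strings, so numeric settings are converted
        "port": int(env.get("APP_DB_PORT", 5432)),
        "user": env.get("APP_DB_USER", "postgres"),
        "password": env.get("APP_DB_PASSWORD", "postgres"),
        "max_connections": int(env.get("APP_DB_CONNECTIONS", 4)),
        "stale_timeout": int(env.get("APP_DB_STALE_TIMEOUT", 300)),
    }


# A plain dictionary stands in for the real environment in this example.
settings = load_db_settings({"APP_DB_HOST": "dbserver", "APP_DB_PORT": "5433"})
print(settings["host"], settings["port"])  # prints: dbserver 5433
```

Passing a dictionary instead of os.environ keeps the helper easy to try out without touching the real environment.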

3.2.2.2 PostgreSQL HSTORE function

When using PostgreSQL with peewee as above, peewee assumes by default that PostgreSQL's HSTORE feature is used. Therefore, if HSTORE is not enabled in the target database, peewee's DB access fails with an error.

As a countermeasure, either enable HSTORE in the database with CREATE EXTENSION hstore; or set register_hstore = False so that peewee does not use HSTORE.

3.2.2.3 Definition of Base Model

To use peewee's O/R mapping, define a class that inherits from peewee's Model. Here it is a class called BaseModel; it must define an inner Meta class whose database field is set to the DB definition object described above.

3.2.3 peewee model definition

peewee's O/R mapping associates tables with classes and columns with fields. An instance of a peewee model then corresponds to one record in the table.

A peewee model is defined as a class that inherits from the BaseModel described above. For each field, choose the peewee field type that matches the column's data type.

See http://docs.peewee-orm.com/en/latest/peewee/models.html#fields for more information.

Set primary_key = True on the primary key field. If no field has primary_key set, peewee assumes that there is a primary key column called id, so an error occurs if the table has no id column.

If the field name differs from the column name, specify the column explicitly with db_column = "column name".

3.2.4 Request mapping with flask

To create a Flask application, first create an instance of Flask with __name__ as the argument. Then attach the route() function decorator to a function to associate HTTP requests with it: the decorator's arguments define the request mapping (URL path and HTTP methods).

For those who have used the Spring Framework (Spring MVC) in Java, the @RequestMapping annotation is a good mental model. (Unlike @RequestMapping, Flask cannot map on request parameters or headers.) Part of the URL path can also be received as a function argument, as in the following definition.

@app.route('/questions/<string:question_code>', methods=['GET'])
def reference_question(question_code):
    # question_code holds the value taken from the URL path (body omitted)
    pass

See http://flask.pocoo.org/docs/0.12/quickstart/#routing for more information on request mapping.

3.2.5 Flask error handling

Flask provides the errorhandler() function decorator for setting up error handling. Pass the HTTP status code to be handled as its argument. The demo app handles 404 and 500.

3.2.6 Input check by cerberus

3.2.6.1 Schema definition of input check rule

In cerberus, a set of input check rules is called a schema. (Presumably because it defines the structure of the data, like a database schema.)

A schema is defined as a Python dictionary. I did not know about Python dictionaries, so when I first saw one I thought it was written in JSON.

For each field, describe the data type (type), whether it is required (required), and the input check rules to apply (maxlength, regex, and so on). I think it is simple enough to understand at a glance.

Please refer to http://docs.python-cerberus.org/en/stable/validation-rules.html for the input check rules provided by default.
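To make the meaning of these rules concrete, here is a small hand-written stand-in, assuming Python 3. It is not cerberus and covers only the required, empty, maxlength, and regex rules for string fields; the function name check and the message texts are my own, modeled loosely on cerberus' output.

```python
import re


# Simplified stand-in for illustration only (this is NOT cerberus): it applies
# just the 'required', 'empty', 'maxlength' and 'regex' rules to string fields.
def check(document, schema):
    errors = {}
    for field, rules in schema.items():
        if field not in document:
            if rules.get("required"):
                errors.setdefault(field, []).append("required field")
            continue
        value = document[field]
        if not isinstance(value, str):
            errors.setdefault(field, []).append("must be of string type")
        elif value == "":
            if rules.get("empty") is False:
                errors.setdefault(field, []).append("empty values not allowed")
        else:
            if len(value) > rules.get("maxlength", len(value)):
                errors.setdefault(field, []).append(
                    "max length is %d" % rules["maxlength"])
            if "regex" in rules and not re.match(rules["regex"], value):
                errors.setdefault(field, []).append("value does not match regex")
    return errors


schema = {
    "question_code": {"required": True, "empty": False,
                      "maxlength": 10, "regex": "^[0-9]+$"},
    "category": {"required": True, "empty": False, "maxlength": 10},
}
errors = check({"question_code": "99999999990000000", "category": ""}, schema)
print(errors)
```

Each field collects a list of messages, so one field can report several failed rules at once.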

3.2.6.2 Execution of input check

To check input with cerberus, first create a Validator instance from the schema, then use that instance to run the check. (It is also possible to avoid creating an instance per schema, but I omit that here.)

In the demo app, the schema is called question_schema, so the Validator instance is created with it as the argument.

Running the check is easy: call the validate method with the dictionary to be checked as the argument. It returns True if the input has no errors and False otherwise.

When used together with Flask, the request body is available as a dictionary (parsed from JSON) through the request.json property, so pass it to the validate method. Since this is a REST application I do not use it here, but form input is available through request.form.

validate_pass = v.validate(request.json)
validate_pass = v.validate(request.form)

If you want to check a model instance independently of Flask, pass its data as a dictionary, because the validate method only accepts dictionary arguments. For a peewee model, the field values are held in __dict__['_data'], the same dictionary the demo app returns from register.

question_ng = Question()
question_ng.question_code = "abc0123456789xyz"
question_ng.category = None
question_ng.message = ""

validate_pass = v.validate(question_ng.__dict__['_data'])

By default, cerberus reports an input check error if the input data contains fields that are not defined in the schema. To change this behavior, set the Validator instance's allow_unknown property to True (allow_unknown = True).

As the section on error details below shows, a cerberus Validator instance is stateful: it holds the result of the check internally. It is not stateless.

That is why the demo app creates a Validator instance for each request. In a real application, you need to weigh the cost of creating instances against the need to keep Validator use thread-safe.
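One possible compromise between creating an instance per request and sharing a single instance is to keep one instance per thread. The following is only a sketch of that idea with the standard library's threading.local. StatefulValidator is a hypothetical stand-in that, like the real Validator, stores its last result on the instance, and get_validator is a name I made up. Whether this is appropriate for a real deployment is an assumption to verify.

```python
import threading


# Hypothetical stand-in for a stateful validator: like cerberus.Validator,
# it keeps the result of the last check on the instance.
class StatefulValidator(object):
    def __init__(self):
        self.errors = {}

    def validate(self, document):
        if document.get("message"):
            self.errors = {}
        else:
            self.errors = {"message": ["required field"]}
        return not self.errors


_local = threading.local()


def get_validator():
    """Return a validator that belongs to the current thread only."""
    if not hasattr(_local, "validator"):
        _local.validator = StatefulValidator()  # created once per thread
    return _local.validator


# Each worker thread gets its own instance, so results are never mixed up.
seen = []


def worker():
    seen.append(get_validator())


threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(seen[0] is not seen[1])
```

Within one thread the same instance is reused, which avoids repeated instantiation while keeping the stored errors private to that thread.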

3.2.6.3 About the error content of the input check error

In cerberus, the details of input check errors are stored in the errors property of the Validator instance. Because they are held in an instance property, they are overwritten every time a check runs. This is why a Validator instance cannot be shared between threads (it is not thread-safe).

if not validate_pass:
    print(v.errors)
    abort(404)

To see the contents of the errors property, check input data that causes errors, such as the following.

Input data that causes an error


question = {
    'question_code' : '99999999990000000',
    'category' : '',
    'message' : 'hello'
}

The content of errors is a dictionary whose keys are the fields in which errors occurred and whose values are lists of error messages. Therefore, if several checks fail for one field, several messages are stored for it.

Displayed contents of v.errors


{u'category': ['empty values not allowed'], u'question_code': ['max length is 10']}

3.2.7 Transaction control in peewee

peewee provides a function decorator for transaction control. You can set a transaction boundary simply by adding the atomic() decorator to a function. It resembles declarative transactions with @Transactional in the Spring Framework, so it should feel familiar to those who have used Spring. (The mechanism differs, since Spring relies on AOP and the languages work differently.)

Of course, you can also nest transactions and explicitly control begin, rollback, and commit. For more information on transaction control, see http://docs.peewee-orm.com/en/latest/peewee/transactions.html.
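To illustrate the idea of a decorator that marks a transaction boundary, here is a minimal sketch that uses the standard library's sqlite3 instead of peewee and PostgreSQL, assuming Python 3. The atomic decorator below is a hypothetical re-creation written for this example and is not how peewee implements atomic(). The table is a cut-down version of the question table.

```python
import functools
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE question (question_code TEXT PRIMARY KEY, message TEXT NOT NULL)")


# Hypothetical decorator for illustration; not peewee's implementation.
def atomic(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # "with conn" commits on normal return and rolls back on an exception
        with conn:
            return func(*args, **kwargs)
    return wrapper


@atomic
def register_two(first, second):
    conn.execute("INSERT INTO question VALUES (?, ?)", first)
    conn.execute("INSERT INTO question VALUES (?, ?)", second)


register_two(("1", "hello"), ("2", "world"))
try:
    # the second insert violates the primary key, so the first is rolled back too
    register_two(("3", "ok"), ("1", "duplicate key"))
except sqlite3.IntegrityError:
    pass

codes = [row[0] for row in
         conn.execute("SELECT question_code FROM question ORDER BY question_code")]
print(codes)  # prints: ['1', '2']
```

The record with code 3 does not remain because both inserts of the failed call belong to one transaction.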

3.2.8 insert in peewee

There are two ways to insert with peewee, so when developing in a team it seems necessary to agree on one.

3.2.8.1 Record registration and instantiation at the same time

Use the create() method of the model class to register the record and create the instance at the same time. Pass the data required for the insert (values for all NOT NULL columns) as arguments to create.

question = Question.create(question_code = input.get("question_code"),
    category = input.get("category"),
    message = input.get("message"))

The demo app does not use this method because it did not seem to fit the O/R mapping idea of mapping instances to records.

See http://docs.peewee-orm.com/en/latest/peewee/api.html#Model.create for more information.

3.2.8.2 Register records at any time after creating an instance

After creating the instance you want to register, call its save() method whenever you want to register the record. The key point is to pass force_insert = True as an argument.

When the primary key is already set on the instance, as it is here, save() normally issues an UPDATE; with force_insert = True, an INSERT is issued instead.

See http://docs.peewee-orm.com/en/latest/peewee/api.html#Model.save for more information.
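The difference can be seen at the SQL level with a small sketch that uses the standard library's sqlite3 rather than peewee. The statements below are my approximation of the two kinds of SQL involved, not peewee's exact output. An UPDATE for a key that does not exist yet changes nothing, which is why a plain save() would not register a new record here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE question ("
    "question_code TEXT PRIMARY KEY, "
    "category TEXT NOT NULL, "
    "message TEXT NOT NULL)")
row = ("9999999999", "demo", "hello")

# What an update-style save amounts to: no row has this key yet,
# so the statement succeeds but changes nothing.
updated = conn.execute(
    "UPDATE question SET category = ?, message = ? WHERE question_code = ?",
    (row[1], row[2], row[0])).rowcount

# What force_insert=True asks for instead: a plain INSERT.
inserted = conn.execute("INSERT INTO question VALUES (?, ?, ?)", row).rowcount

total = conn.execute("SELECT COUNT(*) FROM question").fetchone()[0]
print(updated, inserted, total)  # prints: 0 1 1
```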

3.3 Demo client

3.3.1 Source code

democlient.py


# -*- coding: utf-8 -*-
import requests

# if a proxy error occurs, try "set NO_PROXY=127.0.0.1,localhost" at the console
def register_question():
    print("[POST] /questions")
    
    question = {
        'question_code' : '9999999999',
        'category' : 'demo',
        'message' : 'hello'
    }
    headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
    response = requests.post('http://localhost:3000/questions',
        json=question, headers=headers)
    print(response.status_code)
    print(response.content)

# main
if __name__ == "__main__":
    register_question()

3.3.2 Sending HTTP requests via requests

requests has functions that correspond to HTTP methods, such as post and get. This time I used post to send the request with POST.

To send data in JSON format, pass it in the json argument, as in json = question. Similarly, to specify HTTP headers for the request, pass them in the headers argument.

In fact, when you use the json argument, application/json is set in Content-type automatically, so you do not need to set it explicitly as the demo source code does. (I set it here because I wanted to learn how to set HTTP headers.)
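For comparison, here is a sketch of what a JSON POST consists of, built by hand with the standard library's urllib.request instead of requests (assuming Python 3). The dictionary is serialized into the request body and the Content-Type header announces JSON. The request is only constructed here; actually sending it requires the demo service to be running.

```python
import json
import urllib.request

question = {
    "question_code": "9999999999",
    "category": "demo",
    "message": "hello",
}

# Build the request by hand: JSON-encoded body plus an explicit header.
req = urllib.request.Request(
    "http://localhost:3000/questions",
    data=json.dumps(question).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# urllib stores header names capitalized, hence "Content-type" in the lookup.
print(req.get_method(), req.get_header("Content-type"))

# Sending is left commented out because it needs the demo service:
# with urllib.request.urlopen(req) as response:
#     print(response.status, response.read())
```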

3.4 Operation check

3.4.1 Starting microservices

C:\tmp>python demoapp.py
 * Running on http://localhost:3000/ (Press CTRL+C to quit)

When scaling out, you may run multiple processes on the same machine. In that case, the listening ports must not collide. The demo app lets you override the listening port with an environment variable, so you can change it at startup.

C:\tmp>set APP_PORT=3001
C:\tmp>python demoapp.py
 * Running on http://localhost:3001/ (Press CTRL+C to quit)

3.4.2 Starting the demo client

Run the demo client while the database does not yet contain the record. It should run normally and return the registered data as JSON.

C:\tmp>python democlient.py
[POST] /questions
200
{
  "content": {
    "category": "demo",
    "message": "hello",
    "question_code": "9999999999"
  },
  "result": true
}

Next, run the demo client again in this state. This time you get an error because of the unique constraint. If HTTP status 500 and {'error': 'ERROR'} are returned, the error handling you set up is working as well.

C:\tmp>python democlient.py
[POST] /questions
500
{
  "error": "ERROR"
}

Finally, delete the record from the DB and try again. It should work, since the unique constraint error no longer occurs.

C:\tmp>python democlient.py
[POST] /questions
200
{
  "content": {
    "category": "demo",
    "message": "hello",
    "question_code": "9999999999"
  },
  "result": true
}

4 Conclusion

I explained how to create a microservice in Python using flask, cerberus, and peewee. I hope this showed how easily these libraries let you implement HTTP request handling, input checking, and DB access, the minimum functions a web application needs.

As I explained at the start of the concept application, this article does not consider the division of functions and files by responsibility, which is important in real system development. In addition, authentication, authorization, flow control, monitoring, logging, and so on, which are essential for exposing microservices externally, must be considered separately.
