Create a Python machine learning model retraining mechanism with MLflow

This is a working memo from building a mechanism that retrains a scikit-learn model, evaluates its accuracy, and updates the production model, using the Python library mlflow.

Rough requirements for the environment

After a model goes into production, this mechanism periodically retrains it on new data and updates the production model. In short: train and register models, evaluate each version against fresh data, and promote the chosen version to Production.

Assumed use case example: several model types are maintained in parallel (here, one model per data category), each retrained on a regular schedule, with each new version evaluated before being promoted to production.

Environment

Preparation

Installing Anaconda

This time, Anaconda is installed to make it easy to develop the model in JupyterLab. If you prefer another way of setting up Python, feel free to use it.

#Update yum
sudo yum update

#Install git (needed to install pyenv)
sudo yum install -y git

#Install pyenv (the git:// protocol is no longer served by GitHub, so https is used)
git clone https://github.com/pyenv/pyenv.git ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
source ~/.bash_profile

#Check the available Anaconda versions
pyenv install -l | grep anaconda

#Install Anaconda
pyenv install anaconda3-2020.07

#Confirm the installed Anaconda version
pyenv versions

#Switch the python environment to Anaconda
pyenv global anaconda3-2020.07

Installing mlflow

mlflow is installed with pip. Prerequisites such as gunicorn and flask are installed along with it.

pip install mlflow

Installing sqlite3

This time sqlite3, the easiest option to set up, is used as the backend store of the mlflow tracking server. It is already included in the anaconda3-2020.07 installed above, so no additional steps are required.

Note that sqlite3 is part of the Python standard library, not a pip package. If you set up your Python environment another way, it should normally already be available; if the underlying SQLite library itself is missing, install it with your OS package manager (e.g. sudo yum install sqlite).
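As a quick check (a minimal sketch; the sqlite3 module ships with the Python standard library), you can confirm it is importable and see the bundled SQLite version:

#Confirm the sqlite3 standard-library module is available
import sqlite3
print(sqlite3.sqlite_version)  # e.g. 3.33.0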

Start mlflow tracking server

mlflow uses two storage areas: the Backend Store, which holds experiment and run metadata, and the Artifact Store, which holds models and other output files.

See the mlflow documentation for more details on both. https://mlflow.org/docs/latest/tracking.html#backend-stores https://mlflow.org/docs/latest/tracking.html#artifact-stores

This time, sqlite3 is used as the Backend Store because it is the easiest to prepare, and a local directory is used as the Artifact Store.

#Create directories for the backend store and artifacts (optional)
sudo mkdir /mnt/share
sudo chmod 777 /mnt/share

#Start mlflow tracking server
mlflow server --backend-store-uri sqlite:////mnt/share/mlflow.db --default-artifact-root /mnt/share/mlflow_artifacts --host 0.0.0.0 --port 5000

(Optional) Firewall settings

If you want to access the mlflow tracking server URL (http://IPaddress:5000 in the case above) from another machine, the firewall must allow the connection. The following stops firewalld entirely, which is often done for testing purposes only.

#Stop firewalld
sudo systemctl stop firewalld

#Disable firewalld autostart
sudo systemctl disable firewalld

Access to mlflow tracking server

Access the started tracking server at http://IPaddress:5000 with a web browser and confirm that the mlflow screen is displayed. Both Experiments and Models are still empty.

Model learning

From here, mlflow is operated from Python. JupyterLab, which comes with Anaconda, is used as the Python development environment; JupyterLab itself is not explained here.

Library import and constant definition

#Library import
import pandas as pd
import numpy as np
import copy
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
import mlflow

train_result_file_name is the temporary text file to which each model's training results are written before being registered as run artifacts.

#Temporary output destination file name of detailed information of each model training
train_result_file_name = '/tmp/train_result.txt'

# mlflow tracking server URL
mlflow_tracking_server_url = 'http://localhost:5000'
mlflow.set_tracking_uri(mlflow_tracking_server_url)

The following is required for the automatic model update. Note that it must either run after the set_tracking_uri call above or be given tracking_uri='http://localhost:5000' as an argument. Otherwise, you will hit a cryptic KeyError and waste time on it.

#Client instance of mlflow tracking server (for Production version operation)
client = mlflow.tracking.MlflowClient()
#client = mlflow.tracking.MlflowClient(tracking_uri=mlflow_tracking_server_url)
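As a quick sanity check (a hedged sketch, assuming the mlflow 1.x client API used in this article), you can confirm the client is actually talking to the tracking server:

#Connectivity check: an empty list is expected on a fresh server
print(client.list_registered_models())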

Defining the model types, objective variable, and explanatory variables

#Model type column name in the training data table ★★★ Case sensitivity needs to be considered ★★★
model_types_column_name = 'mtype'
#Model type
model_types = [
    {'name': 'type1', 'detail': 'Detailed explanation of model pattern 1'},
    {'name': 'type2', 'detail': 'Detailed explanation of model pattern 2'},
    {'name': 'type3', 'detail': 'Detailed explanation of model pattern 3'}
]
#Objective variable column name
target_val_column_name = 'target'
#Explanatory variable column name
feature_val_column_names = [
    'sepal length (cm)',
    'sepal width (cm)',
    'petal length (cm)',
    'petal width (cm)'
]

Reading training data

#Use iris as sample data. 150 cases in total
from sklearn.datasets import load_iris
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target_names[iris.target]

#Model type column is generated by random numbers
import random
random.seed(1)

mtype_list = []
for i in range(len(df)):
    mtype_list.append(model_types[random.randint(0,len(model_types)-1)]['name'])

df[model_types_column_name] = mtype_list

Running df.head() in JupyterLab shows the data with the random mtype column appended.

Model learning

Three models are created by looping over the model types. Since regular retraining is assumed, all of the data is used for training without a train/test split. After training, the results are recorded to Experiments with mlflow and the models are registered.

#Get current date(Used for the name of run in Experiments)
from pytz import timezone
import datetime
now_datetime = datetime.datetime.now(timezone('Asia/Tokyo')).strftime('%Y%m%d_%H%M%S')
now_date = now_datetime[0:8]
for model_type in model_types:
    #Cutting out data for each model type
    df_mtype = df[df[model_types_column_name] == model_type['name']]
    
    #Splitting explanatory variables and objective variables
    X = df_mtype[feature_val_column_names]
    y = df_mtype[target_val_column_name]

    #Model learning
    model = RandomForestClassifier(max_depth=2, random_state=5, n_estimators=10)
    model.fit(X, y)

    #Accuracy for training data(Reference value)
    predicted = model.predict(X)
    ac_score = accuracy_score(y, predicted)
    con_matrix = confusion_matrix(y, predicted)
    clf_report = classification_report(y, predicted)
    
    #Registration of learning results in mlflow
    mlflow.set_experiment('Model learning history')
    mlflow_run_name = now_date + 'Learning results_' + model_type['name']
    with mlflow.start_run(run_name=mlflow_run_name) as run:
        #Record to run as a parameter
        mlflow.log_param('Model type name', model_type['name'])
        mlflow.log_param('Number of training data', len(X))
        mlflow.log_param('Correct answer rate_Training data', ac_score)  # logged as a param because log_metric keys cannot contain Japanese

        #Output the details of the model training result to a file and register it in the artifact of run
        with open(train_result_file_name, 'w') as f:
            print('Model type name:', model_type['name'], file=f)
            print('Objective variable:', target_val_column_name, file=f)
            print('Explanatory variable:', feature_val_column_names, file=f)
            print('Number of training data:', len(X), '\n', file=f)
            print('Model accuracy(Training data)', '\n', '------------------------------------------------', file=f)
            print('Confusion matrix(confusion_matrix) :', '\n', confusion_matrix(y,predicted), file=f)
            print('Objective variable label', np.sort(y.unique()), '\n', file=f)
            print('Correct answer rate(accuracy) :', '\n', ac_score, '\n', file=f)
            print('Accuracy report(classification_report) :', '\n', clf_report, '\n', file=f)
        mlflow.log_artifact(train_result_file_name)

        #Register model to run
        mlflow.sklearn.log_model(sk_model=model, artifact_path='model')

    #Register model with Models(Register model)
    model_uri = 'runs:/{}/model'.format(run.info.run_id)
    reg_model = mlflow.register_model(model_uri, model_type['name'])

    #Production model update mode (auto: automatic update, the retrained model goes straight to Production / manual: a person promotes it on the mlflow screen)
    #---------------------------------------------------------------#
    #In production this value would come from a DB or similar; here it is hard-coded for now.
    #---------------------------------------------------------------#
    model_update_mode = 'manual'

    #If the model version is 1, be sure to register it as Production
    if reg_model.version == '1':
        client.transition_model_version_stage(
            name = model_type['name'],
            version = reg_model.version,
            stage = "Production",
            archive_existing_versions = True
        )
    else:
        #Model update mode:Automatic
        if model_update_mode == 'auto':
            client.transition_model_version_stage(
                name = model_type['name'],
                version = reg_model.version,
                stage = "Production",
                archive_existing_versions = True
            )

Learning result (1st time)

This is the state after executing the training code once. The training results are recorded in Experiments on the mlflow screen.

(Commentary)

This is the screen shown after clicking a Start Time link in the model learning history.

The trained model is displayed in the Artifacts section. You can see that this tracking server's Artifact Store is a local directory, and that this model is stored at /mnt/share/mlflow_artifacts/1/950d96375b044a2383cde334ff86534b/artifacts/model. The Make Predictions section also shows sample code for loading the model and executing predictions.
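As a hedged sketch of what the Make Predictions panel suggests (the run id below is the illustrative one from above):

#Load the model logged to a run and predict with it
loaded = mlflow.sklearn.load_model('runs:/950d96375b044a2383cde334ff86534b/model')
print(loaded.predict(X.head(1)))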

You can register arbitrary files as Artifacts with mlflow.log_artifact. Here, the evaluation results of the trained model are registered as a text file.

This is the Models screen. You can see that the trained models for model types type1 through type3 have each been registered as Version 1 by mlflow.register_model. Furthermore, client.transition_model_version_stage registers a model as Production whenever it is version 1, making it the initial production model. This handles the case where a new model type is added during operation.

By the way, mlflow also has a Staging label in addition to Production, but it is not used this time. It would be nice if these labels could be defined freely, but mlflow does not seem to have that capability at the moment (January 2021).

Learning result (2nd time)

This is the state of Experiments after executing the training code a second time. Three new runs are registered in the model learning history.

On the Models screen, Version 2 is now registered as the Latest Version. Production remains Version 1, since manual update mode was chosen and no automatic promotion takes place.

Model accuracy evaluation

To compare the prediction accuracy of each model version against the latest evaluation data, the models are executed for evaluation and the results are recorded in Experiments. This is assumed to run as a regular or manually triggered batch on a shorter cycle than model retraining.

The following parts are the same as in model training. The evaluation data should really be the latest data, but this time the same data as for training is used.

#Library import
import pandas as pd
import numpy as np
import copy
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
import mlflow

#Temporary output destination file name of detailed information of each model evaluation
eval_result_file_name = '/tmp/eval_result.txt'

# mlflow tracking server URL
mlflow_tracking_server_url = 'http://localhost:5000'
mlflow.set_tracking_uri(mlflow_tracking_server_url)

#Client instance of mlflow tracking server (for Production version operation)
#Must be created after the set_tracking_uri call above, or given tracking_uri='http://localhost:5000' as an argument; otherwise a cryptic KeyError occurs
client = mlflow.tracking.MlflowClient()
#client = mlflow.tracking.MlflowClient(tracking_uri=mlflow_tracking_server_url)

#Model type column name in the training data table ★★★ Case sensitivity needs to be considered ★★★
model_types_column_name = 'mtype'
#Model type
model_types = [
    {'name': 'type1', 'detail': 'Detailed explanation of model pattern 1'},
    {'name': 'type2', 'detail': 'Detailed explanation of model pattern 2'},
    {'name': 'type3', 'detail': 'Detailed explanation of model pattern 3'}
]
#Objective variable column name
target_val_column_name = 'target'
#Explanatory variable column name
feature_val_column_names = [
    'sepal length (cm)',
    'sepal width (cm)',
    'petal length (cm)',
    'petal width (cm)'
]

#Use iris as sample data. 150 cases in total
from sklearn.datasets import load_iris
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target_names[iris.target]

#Model type column is generated by random numbers
import random
random.seed(1)

mtype_list = []
for i in range(len(df)):
    mtype_list.append(model_types[random.randint(0,len(model_types)-1)]['name'])

df[model_types_column_name] = mtype_list

#Get current date(Used for the name of run in Experiments)
from pytz import timezone
import datetime
now_datetime = datetime.datetime.now(timezone('Asia/Tokyo')).strftime('%Y%m%d_%H%M%S')
now_date = now_datetime[0:8]

From here, the models are executed and their accuracy is measured for evaluation. First, each model to be evaluated is loaded and stored in a dictionary variable. For simplicity, all versions of all model types are evaluated.

#Get all versions of all model types
loaded_models = {}
for model_type in model_types:
    loaded_models_ver = {}
    for i in range(1,100):       
        try:
            loaded_models_ver['Version'+str(i)] = mlflow.sklearn.load_model('models:/'+model_type['name']+'/'+str(i))
        except Exception as e:
            loaded_models_ver['Version'+str(i)] = 'NONE'
            loaded_models_ver['latest_version'] = i-1
            break
    loaded_models[model_type['name']] = loaded_models_ver

~~One part of the code above is not great: the inner for statement fixes the version numbers at 1 to 100. I searched the APIs but could not find a way to get the latest version number of a model, so I resorted to counting versions up one by one and treating the version just before the first failed model load as the latest. **If anyone knows a smart way to get the latest version number of a model registered in Models, please let me know.**~~

(Added on 2021/1/5) I was told that MlflowClient's search_model_versions("name='<name>'") can be used for this. It returns a clean list of version numbers. I will update the article once I have verified it.
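A minimal, unverified sketch of that suggestion (assuming the mlflow 1.x client API, where each returned ModelVersion carries a string version attribute):

#Hedged sketch: get the latest version number of a registered model
#via search_model_versions, instead of probing versions 1 to 100
versions = client.search_model_versions("name='{}'".format(model_type['name']))
latest_version = max(int(v.version) for v in versions)
print(latest_version)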


Displaying loaded_models at this point shows, for each model type, the loaded model per version, the 'NONE' sentinel for the first missing version, and the detected latest_version.

From here, each model is called, predictions are executed, and the results are recorded with mlflow. It is quite similar to the training step, except that the model in use is swapped in turn for each of the models loaded above.

# evaluation
for model_type in model_types:
    #Cutting out data for each model type
    df_mtype = df[df[model_types_column_name] == model_type['name']]
    
    #Splitting explanatory variables and objective variables
    X = df_mtype[feature_val_column_names]
    y = df_mtype[target_val_column_name]
   
    #Registration of evaluation results in mlflow
    mlflow_experiment_name = now_date + 'Evaluation_' + model_type['name']
    mlflow.set_experiment(mlflow_experiment_name)
    
    for i in range(1,100):
        model = loaded_models[model_type['name']]['Version'+str(i)]
        if model == 'NONE': break
         
        #Accuracy to evaluation data
        predicted = model.predict(X)
        ac_score = accuracy_score(y, predicted)
        con_matrix = confusion_matrix(y, predicted)
        clf_report = classification_report(y, predicted)
    
        mlflow_run_name = 'Version'+str(i)
        with mlflow.start_run(run_name=mlflow_run_name) as run:
            #Record to run as a parameter
            mlflow.log_param('Version', i)
            mlflow.log_param('Model type name', model_type['name'])
            mlflow.log_param('Number of evaluation data', len(X))
            mlflow.log_metric('accuracy', ac_score)  # Japanese cannot be used in log_metric keys; a metric is used here so versions can be compared

            #Output the details of the model evaluation result to a file and register it in the run's artifacts
            with open(eval_result_file_name, 'w') as f:
                print('Model type name:', model_type['name'], file=f)
                print('Model version:', 'Version'+str(i), file=f)                
                print('Objective variable:', target_val_column_name, file=f)
                print('Explanatory variable:', feature_val_column_names, file=f)
                print('Number of evaluation data:', len(X), '\n', file=f)
                print('Model accuracy(Evaluation data)', '\n', '------------------------------------------------', file=f)
                print('Confusion matrix(confusion_matrix) :', '\n', confusion_matrix(y,predicted), file=f)
                print('Objective variable label', np.sort(y.unique()), '\n', file=f)
                print('Correct answer rate(accuracy) :', '\n', ac_score, '\n', file=f)
                print('Accuracy report(classification_report) :', '\n', clf_report, '\n', file=f)
            mlflow.log_artifact(eval_result_file_name)

Evaluation results

This is the Experiments screen of mlflow after executing the evaluation code. The Experiments name is the evaluation date plus the model type, and a run is recorded for each model version.

Here, check the runs you want to compare and click the "Compare" button.

The model version comparison screen is displayed. The idea is that the person in charge of model operation looks at this screen to confirm the accuracy of multiple versions. Clicking the accuracy link displays the comparison as a graph. Since the models created this time used the same training data for both Version 1 and Version 2, they have exactly the same accuracy, but in production you could visually compare the accuracy of each version here.

Operation model update

If you use the automatic update method, the client.transition_model_version_stage call at the end of the training code promotes the newly generated model to Production.

The following is for the manual update method: after the model operation staff confirm the evaluation results on the mlflow screen, they promote the production model (Production) by hand on the mlflow screen.

On the Models screen, open the version you want to promote and select "Transition to Production" from the "Stage" pull-down menu at the upper right. A pop-up appears; just click OK. By the way, Archived is merely a label meaning no longer needed; the model is not deleted. Use client.delete_model_version to actually delete a model version.
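For reference, a hedged one-liner for actually deleting a version (the model name and version number here are illustrative):

#Permanently delete a model version via the client API
client.delete_model_version(name='type1', version='1')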

Looking at the Models list screen again, you can see that Version 2 has changed to Production.

(Supplement) Online execution of model

For a Production model registered in Models, a so-called online model execution environment can be started as a service with the mlflow models serve command.

As a caveat, you must set the tracking server in the MLFLOW_TRACKING_URI environment variable before running it. https://mlflow.org/docs/latest/cli.html#mlflow-models

#Specify tracking server with environment variable
export MLFLOW_TRACKING_URI=http://localhost:5000

#Run the Production version of model name type1
mlflow models serve -m models:/type1/Production -h 0.0.0.0 -p 7001

This exposes the type1 Production model at http://IPaddress:7001/invocations. POSTing prediction input data as JSON from a REST client returns the prediction result as the response.

Model prediction execution example from a REST client (Insomnia).

The input JSON can use the orient='split' format of the pandas DataFrame's to_json method. Example: converting one record of the training data to JSON:

X.head(1).to_json(orient='split',index=False)
# {"columns":["sepal length (cm)","sepal width (cm)","petal length (cm)","petal width (cm)"],"data":[[4.9,3.0,1.4,0.2]]}

That's all.
