[PYTHON] I tried to visualize the model with the low-code machine learning library "PyCaret"

Overview

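Things to do

The workflow consists of the steps below (data load, preprocessing, model comparison, parameter tuning, and model visualization), and PyCaret automates each of them so that it runs in a few lines.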

Try it (⑤ Visualization of model)

⑤ Model visualization

evaluate_model(tuned_model)
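
evaluate_model shows an interactive widget for switching between plots. To draw just one plot, PyCaret also provides plot_model. The sketch below maps the plot names used in this section to the identifiers that, as far as I know, plot_model accepts in pycaret.classification; please check them against the docstring of your installed version. show_plot is a hypothetical helper name, and the import is deferred so the mapping can be inspected even where PyCaret is not installed.

```python
# Plot identifiers for pycaret.classification.plot_model, keyed by the
# plot names used in this article (assumption: verify with the docstring).
PLOT_IDS = {
    "AUC": "auc",
    "Confusion Matrix": "confusion_matrix",
    "Error": "error",
    "Decision Boundary": "boundary",
    "Threshold": "threshold",
    "Precision Recall": "pr",
    "Learning Curve": "learning",
    "Validation Curve": "vc",
    "Feature Importance": "feature",
    "Manifold Learning": "manifold",
    "Dimensions": "dimension",
}


def show_plot(model, name):
    """Draw a single plot for a trained model (hypothetical helper)."""
    # Imported lazily so this module loads even where PyCaret is absent.
    from pycaret.classification import plot_model
    return plot_model(model, plot=PLOT_IDS[name])


# Usage, after setup() and tune_model() have been run:
# show_plot(tuned_model, "AUC")
```

Calling plot_model directly is handy in scripts, where the widget from evaluate_model is not available.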

AUC (ROC curve)

Confusion Matrix

Error

Decision Boundary

| # | Logistic Regression | K Nearest Neighbour | Gaussian Process |
|---|---|---|---|
| boundary | lr.png | knn.png | GP.png |
| Feature | The algorithm is linear, so the decision boundary is also a straight line | The boundary groups nearby points together | A smooth curved boundary, reflecting the bell-curve shape |

Threshold

Precision Recall

Learning Curve

Validation Curve

| Algorithm | Horizontal axis | Algorithm | Horizontal axis |
|---|---|---|---|
| Decision Tree, Random Forest, Gradient Boosting, Extra Trees Classifier, Extreme Gradient Boosting, Light Gradient Boosting, CatBoost Classifier | max_depth | Logistic Regression, SVM (Linear), SVM (RBF) | C |
| Multi Level Perceptron (MLP), Ridge Classifier | alpha | AdaBoost | n_estimators |
| K Nearest Neighbour (knn) | n_neighbors | Gaussian Process (GP) | max_iter_predict |
| Quadratic Disc. Analysis (QDA) | reg_param | Naive Bayes | var_smoothing |

Feature Importance

Manifold Learning


Dimensions

Try it (① Data load to ④ Tuning)

① Data load

from pycaret.datasets import get_data
# Load the credit dataset.
# With profile=True, an EDA report is generated via pandas-profiling.
data = get_data('credit', profile=False)

② Preprocessing

from pycaret.classification import *
# Initialize the experiment; 'default' is the target column.
exp1 = setup(data, target='default')

③ Model comparison

compare_models(sort="AUC")

④ Parameter tuning

tuned_model = tune_model(estimator='lightgbm')


* The algorithms that can be specified are listed below. You can also check them in the docstring.

| Algorithm | estimator | Algorithm | estimator |
|---|---|---|---|
| Logistic Regression | 'lr' | Random Forest | 'rf' |
| K Nearest Neighbour | 'knn' | Quadratic Disc. Analysis | 'qda' |
| Naive Bayes | 'nb' | AdaBoost | 'ada' |
| Decision Tree | 'dt' | Gradient Boosting | 'gbc' |
| SVM (Linear) | 'svm' | Linear Disc. Analysis | 'lda' |
| SVM (RBF) | 'rbfsvm' | Extra Trees Classifier | 'et' |
| Gaussian Process | 'gpc' | Extreme Gradient Boosting | 'xgboost' |
| Multi Level Perceptron | 'mlp' | Light Gradient Boosting | 'lightgbm' |
| Ridge Classifier | 'ridge' | CatBoost Classifier | 'catboost' |
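
A note on versions: passing the ID string straight to tune_model, as in step ④, is the PyCaret 1.x call style. As far as I know, from PyCaret 2.0 onward tune_model expects a trained model object, so the model is first built with create_model. The sketch below wraps steps ①, ②, and ④ in that newer style; run_workflow is a hypothetical helper name, and the imports are deferred so the function can be defined without PyCaret installed.

```python
def run_workflow(estimator_id="lightgbm"):
    """Load, preprocess, and tune in the newer call style (a sketch only)."""
    # Imported here so the function can be defined without PyCaret installed.
    from pycaret.datasets import get_data
    from pycaret.classification import setup, create_model, tune_model

    data = get_data("credit", profile=False)       # 1. load the credit dataset
    setup(data, target="default")                  # 2. preprocessing
    base_model = create_model(estimator_id)        # train with default parameters
    return tune_model(base_model, optimize="AUC")  # 4. tune, optimizing AUC


# tuned_model = run_workflow("lightgbm")
```

Any estimator ID from the table above can be passed in place of "lightgbm".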

Summary

* I have described each way of visualizing the model separately, so I would like to finish by organizing them by purpose.
* Following the flow of input data -> modeling -> results, I group them into the following five purposes.
  * A) Understand the input data and the features themselves.
  * B) Understand the features that the model is looking at.
  * C) Determine the model's learning status (underfitting, overfitting).
  * D) Consider the predictive characteristics of the model and the thresholds at which the objectives can be achieved.
  * E) Understand the prediction performance and prediction results of the model.

| Purpose | Perspective | Visualization |
|---|---|---|
| A) Understand the input data and the features themselves. | Are the positive and negative data separable? | Manifold Learning |
| | Same as above | Dimensions |
| B) Understand the features that the model is looking at. | Which features are important? | Feature Importance |
| C) Determine the model's learning status (underfitting, overfitting). | Can prediction performance be improved by increasing the number of training samples? | Learning Curve |
| | Is overfitting suppressed by regularization? | Validation Curve |
| D) Consider the predictive characteristics of the model and the thresholds at which the objectives can be achieved. | Which threshold value gives the desired prediction characteristic? | Threshold |
| | What is the relationship between Precision and Recall? | Precision Recall |
| E) Understand the prediction performance and prediction results of the model. | What is the AUC (predictive performance)? | AUC |
| | Understand the boundaries of the results | Decision Boundary |
| | Understand how the model makes mistakes | Confusion Matrix |
| | Same as above | Error |
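
For purpose D, both the Threshold and Precision Recall plots come down to recomputing precision and recall while the cutoff on the predicted probability moves. Here is a minimal pure-Python sketch of that calculation. The labels and scores are made up for illustration and are not results from the credit dataset, and precision_recall_at is a hypothetical helper name.

```python
def precision_recall_at(y_true, y_score, threshold):
    """Return (precision, recall) when scores >= threshold count as positive."""
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for t, s in pairs if s >= threshold and t == 1)
    fp = sum(1 for t, s in pairs if s >= threshold and t == 0)
    fn = sum(1 for t, s in pairs if s < threshold and t == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative data only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]

for threshold in (0.2, 0.5, 0.75):
    p, r = precision_recall_at(y_true, y_score, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

Lowering the threshold never lowers recall, while precision can move in either direction, which is why it is worth reading the plot before fixing a cutoff.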

Finally

* Thank you for reading to the end.
* I would be happy if you could like and share this article.
* If there is enough response, I will write a longer follow-up (parameter explanations, etc.).
