[PYTHON] A free version of DataRobot!? Introduction to "PyCaret", a library that automates machine learning

What is PyCaret

I recently came across an article titled "Announcing PyCaret 1.0.0". It looked like an interesting library, so this article explains how to actually use PyCaret. **PyCaret is a Python library that lets you perform data preprocessing, visualization, and model development for machine learning with just a few lines of code.**

PyCaret is a Python wrapper around several major machine learning libraries (scikit-learn, XGBoost, LightGBM, and others). It can handle classification, regression, clustering, anomaly detection, and natural language processing. In that sense, PyCaret is something like a free version of DataRobot.

Basically, it seems to cover the whole workflow: preprocessing, modeling, performance evaluation, tuning, and visualization. Stacking is also supported. (Time series analysis and some evaluation metrics such as log loss do not appear to be provided.) PyCaret on GitHub

This article is aimed at readers who have already worked through a basic machine learning workflow with scikit-learn. (If you are a beginner in machine learning, you may not follow the contents and may come away with only a vague impression. Apologies for plugging our own service, but you could first try getting started with machine learning programming at [AI Academy](https://aiacademy.jp/) or similar.)

There is also a video walkthrough of Kaggle's Credit Card Fraud Detection using PyCaret, so please take a look.

Let's try PyCaret!

First, install PyCaret.

When installing from a terminal or command prompt, you can install with the following command.

pip install pycaret

In Jupyter Notebook and Google Colab, run the same command with a ! at the beginning.

!pip install pycaret

Load necessary modules & prepare data

This time, I will try multi-class classification using the iris dataset. First, import the required modules.

import warnings
# Suppress unnecessary warnings.
warnings.filterwarnings("ignore")
# The star of this article: load PyCaret's classification module.
from pycaret.classification import *
# Load the iris dataset.
from sklearn.datasets import load_iris
# PyCaret works on DataFrames, so import pandas as well.
import pandas as pd

Next, prepare the data.

iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.DataFrame(iris.target, columns=["target"])
df = pd.concat([X,y], axis=1)

Display the first five rows.

df.head()

Preprocessing

Now for the preprocessing.

Use **setup()** to handle missing values, split the data, and so on.

Pass the name of the target variable (the column to predict) to target.

exp1 = setup(df, target = 'target')
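To make it more concrete what setup() takes off your hands, here is a minimal sketch of two of those steps done by hand with scikit-learn: a hold-out split and mean imputation of missing values. This is not PyCaret's actual implementation. The 70/30 ratio and mean imputation are assumptions based on what I understand PyCaret's defaults to be, and the stratified split, random_state, and variable names are my own choices for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

# Rebuild the same kind of DataFrame as in the article.
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target

features = df.drop(columns="target")
labels = df["target"]

# Hold out 30% of the rows for evaluation (assumed ratio),
# keeping the class proportions the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, train_size=0.7, stratify=labels, random_state=0
)

# Fill missing numeric values with the column mean learned on the training rows only.
# (iris has no missing values, so this step leaves the data unchanged here.)
imputer = SimpleImputer(strategy="mean")
X_train_imputed = imputer.fit_transform(X_train)
X_test_imputed = imputer.transform(X_test)

print(X_train_imputed.shape, X_test_imputed.shape)
```

Fitting the imputer on the training rows only keeps information from the hold-out rows from leaking into preprocessing.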

Let's compare models

To compare models, just call **compare_models()**.

compare_models()
[Screenshot: output of compare_models()]
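Conceptually, compare_models() trains several algorithms with cross-validation and ranks them by score. The rough idea can be sketched with plain scikit-learn as follows. The four candidate models, the 10 folds, and accuracy as the ranking metric are assumptions chosen for illustration; PyCaret compares many more models and reports more metrics.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# A small, hand-picked set of candidates (illustrative only).
candidates = {
    "qda": QuadraticDiscriminantAnalysis(),
    "lr": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(),
    "dt": DecisionTreeClassifier(random_state=0),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Mean cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in candidates.items()
}

# Rank from best to worst, like a leaderboard.
leaderboard = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, accuracy in leaderboard:
    print(f"{name}: {accuracy:.4f}")
```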

Modeling

Specify the ID of the algorithm to train, referring to https://pycaret.org/create-model/. This time we will use Quadratic Discriminant Analysis, which had the highest accuracy in the comparison, and a decision tree. The ID for Quadratic Discriminant Analysis is 'qda', so pass 'qda' here.

qda = create_model('qda')
[Screenshot: output of create_model('qda')]

Let's also try the decision tree.

tree = create_model('dt')
[Screenshot: output of create_model('dt')]

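Tuning

Let's tune the decision tree.

# PyCaret 1.0 takes the model ID as a string here.
# (As far as I know, later releases expect a trained model object instead, e.g. tune_model(tree).)
tuned_tree = tune_model('dt')
[Screenshot: output of tune_model('dt')]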
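As far as I understand, tune_model() searches hyperparameters with a randomized search evaluated by cross-validation. The idea can be sketched with scikit-learn's RandomizedSearchCV. The search space below is hypothetical and chosen by me for illustration; it is not PyCaret's own grid, and n_iter=10 and cv=10 are assumed values.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hypothetical search space for the decision tree.
param_distributions = {
    "max_depth": [2, 3, 4, 5, 6, None],
    "min_samples_leaf": [1, 2, 3, 4, 5],
    "criterion": ["gini", "entropy"],
}

# Try 10 random combinations, scoring each with 10-fold cross-validation.
search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=10,
    cv=10,
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 4))
```

After fitting, search.best_estimator_ holds the tree refitted on all rows with the best combination found.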

Get parameters

tuned_tree.get_params()
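The tuned model should behave like an ordinary scikit-learn estimator, so get_params() returns a plain dictionary that you can index by parameter name. A small sketch, using a hand-configured DecisionTreeClassifier as a stand-in for tuned_tree (the parameter values here are made up):

```python
from sklearn.tree import DecisionTreeClassifier

# Stand-in for tuned_tree: a decision tree with two settings changed by hand.
model = DecisionTreeClassifier(max_depth=3, criterion="entropy")

# Note the parentheses: without them you only get a reference to the method.
params = model.get_params()
print(params["max_depth"], params["criterion"])  # prints: 3 entropy
```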

Model visualization

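# Plot the QDA model created above.
plot_model(qda)

[Image: plot for the QDA model]

# Plot the tuned decision tree.
plot_model(tuned_tree)

[Image: plot for the tuned decision tree]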

Ensemble learning

lgbm = create_model('lightgbm')
xgboost = create_model('xgboost')

ensemble = blend_models([lgbm, xgboost])
[Screenshot: output of blend_models()]
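blend_models() combines the given models by voting. The idea behind majority ("hard") voting can be shown in plain Python with made-up predictions from three hypothetical models. This only illustrates the voting rule itself; it is not PyCaret's code, and the model names and predictions are invented.

```python
from collections import Counter

# Made-up class predictions from three hypothetical models for five samples.
predictions = {
    "model_a": [0, 1, 2, 2, 1],
    "model_b": [0, 1, 1, 2, 1],
    "model_c": [0, 2, 2, 2, 0],
}


def hard_vote(per_model_predictions):
    """Return, for each sample, the class predicted by the most models."""
    blended = []
    # zip(*...) regroups the lists so that each tuple holds one sample's votes.
    for votes in zip(*per_model_predictions.values()):
        winner, _count = Counter(votes).most_common(1)[0]
        blended.append(winner)
    return blended


blended = hard_vote(predictions)
print(blended)  # [0, 1, 2, 2, 1]
```

Each sample ends up with the class that at least two of the three models agree on.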

Stacking

stack = stack_models(estimator_list = [xgboost], meta_model = lgbm)
[Screenshot: output of stack_models()]
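In stacking, the predictions of the base models become the input features of a meta-model. A minimal sketch of that idea with scikit-learn's StackingClassifier follows. To keep it runnable without XGBoost or LightGBM installed, a decision tree and k-nearest neighbors stand in as base models and logistic regression as the meta-model; these substitutions are mine, and this is not how stack_models() is implemented internally.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Base models produce out-of-fold predictions (cv=5),
# and the final estimator learns from those predictions.
stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Evaluate the whole stacked model with an outer 5-fold cross-validation.
accuracy = cross_val_score(stack, X, y, cv=5, scoring="accuracy").mean()
print(round(accuracy, 4))
```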

Prediction

pred = predict_model(qda)

Very convenient.
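When called without new data, predict_model() scores the model on the rows that setup() held out. Done by hand with scikit-learn, that step looks roughly like the sketch below. The 70/30 stratified split and random_state are assumptions for illustration, so the numbers will not match PyCaret's output exactly.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Keep 30% of the rows aside; the model never sees them during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, stratify=y, random_state=0
)

model = QuadraticDiscriminantAnalysis().fit(X_train, y_train)

# Predict the held-out rows and compare with the true labels.
y_pred = model.predict(X_test)
holdout_accuracy = accuracy_score(y_test, y_pred)

print(len(y_pred), round(holdout_accuracy, 4))
```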

Finally

Everything was done in just a few lines of code. I think I will start using it little by little from now on.


About the author

Kazunori Tani, CEO of Cyber Brain Co., Ltd. Follows on Twitter and Facebook are welcome. We also run an AI community with more than 5,000 participants and share information about AI every day, so we look forward to your participation: Artificial Intelligence Research Community, AI Academy.
