Machine learning: A story about someone unfamiliar with GBDT trying GBDT in Python

For some reason there weren't many pages about GBDT in Japanese, so I wrote one. As the title suggests, I don't understand machine learning very well, so I can't respond to pointed comments from experts. Please bear that in mind. I hope it helps people like me who "know a little about machine learning but are lost as soon as the topic gets advanced."

What is GBDT

GBDT is a supervised machine learning method. The name is an abbreviation of Gradient Boosting Decision Tree. Like SVM, it classifies inputs based on labeled training data. However, unlike SVM, which is basically a binary classifier, it can also do multiclass classification. I haven't looked into how it differs from other classifiers such as Random Forest. Sorry. I will skip the theory and other details, because smarter people have already written them up:
http://www.housecat442.com/?p=480
http://qiita.com/Quasi-quant2010/items/a30980bd650deff509b4
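For a rough feel of what "gradient boosting" means, here is a minimal sketch in plain Python. It is not what sklearn does internally: it assumes a made-up 1-D regression problem, squared-error loss, and one-split trees (stumps), and the helper names `fit_stump` and `predict_stump` are hypothetical. Each round fits a small tree to the current errors and adds a scaled-down copy of it to the prediction.

```python
import math

def fit_stump(xs, residuals):
    """Find the one-split tree (stump) that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

# Made-up data for illustration only (not from the article)
xs = [i / 10.0 for i in range(60)]
ys = [math.sin(x) for x in xs]

learning_rate = 0.1
n_estimators = 100

# Start from a constant prediction (the mean of the targets)
preds = [sum(ys) / len(ys)] * len(ys)
initial_mse = mse(ys, preds)

stumps = []
for _ in range(n_estimators):
    # For squared-error loss, the negative gradient is just the residual
    residuals = [y - p for y, p in zip(ys, preds)]
    stump = fit_stump(xs, residuals)
    stumps.append(stump)
    # Add a shrunken copy of the new tree to the running prediction
    preds = [p + learning_rate * predict_stump(stump, x)
             for p, x in zip(preds, xs)]

final_mse = mse(ys, preds)
print(initial_mse, final_mse)
```

The names `n_estimators` and `learning_rate` are chosen to mirror the sklearn parameters of the same name; a classifier uses a different loss, but the "fit a tree to the current errors, add it, repeat" loop is the same idea.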

Trying it out for now

There was an article that solved a CodeIQ problem with SVM, so I will imitate it using GBDT.

sample_gbdt.py


# -*- coding: utf-8 -*-
from sklearn.ensemble import GradientBoostingClassifier
import numpy as np

# Training data: columns 0 and 1 are the features, column 2 is the label
train_data = np.loadtxt('CodeIQ_auth.txt', delimiter=' ')
X_train = [[x[0], x[1]] for x in train_data]
y_train = [int(x[2]) for x in train_data]

# Test data: features come from the file, expected labels are hardcoded
X_test = np.loadtxt('CodeIQ_mycoins.txt', delimiter=' ')
y_test = np.array([1,0,0,1,1,0,1,1,1,0,0,1,1,0,0,1,0,0,0,1])

# Fit GBDT with 100 boosting stages of depth-1 trees
clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
        max_depth=1).fit(X_train, y_train)

print ("Predict ",clf.predict(X_test))
print ("Expected", y_test)
# Parenthesized so that this line runs on both Python 2 and Python 3
print (clf.score(X_test, y_test))

The original article used SVM; I just swapped in GBDT. This time it is a binary classification. Apparently, if you add more kinds of labels, it will do multiclass classification accordingly.

Here are the results.
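As a quick check of the multiclass point, here is a minimal sketch. The CodeIQ data has only two labels, so this assumes sklearn's bundled iris dataset (three labels) instead; the split ratio and `random_state` values are arbitrary choices of mine.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Iris has three classes (0, 1, 2); it is used here only as a stand-in dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Same classifier settings as in sample_gbdt.py; nothing extra is needed
# for three labels
clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 max_depth=1, random_state=0)
clf.fit(X_train, y_train)

print(clf.classes_)                # the labels the model learned
print(clf.score(X_test, y_test))   # accuracy on the held-out part
```

The classifier picks up the set of labels from `y_train`, so the calling code is the same as in the binary case.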

('Predict ', array([1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1]))
('Expected', array([1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1]))
0.85

It got three of them wrong...
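To see exactly which positions were wrong, the two arrays printed above can be compared element by element. A small sketch, with the values copied by hand from the output above into plain lists (positions are counted from 0):

```python
# Values copied from the "Predict" and "Expected" output above
predicted = [1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1]
expected  = [1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1]

# Indices where the prediction differs from the expected label
wrong = [i for i, (p, e) in enumerate(zip(predicted, expected)) if p != e]

# Accuracy = number of correct predictions / total number of samples
accuracy = (len(expected) - len(wrong)) / float(len(expected))

print(wrong)     # [4, 12, 16]
print(accuracy)  # 0.85
```

This is the same number that `clf.score` reports: 17 correct out of 20.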

Parameter adjustment

Let's take a look at the sklearn page. (You don't really have to.)
http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html
There are a lot of parameters: 15 of them. On top of that, the page is in English and I don't understand it well. Looking closely, though, the default for max_depth is 3. Mine is set to 1. Why?

So I changed max_depth to 3 and tried again.

('Predict ', array([1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1]))
('Expected', array([1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1]))
0.95

It went up. It did go up, but it still gets one wrong. Getting carried away, let's raise max_depth even further. For example, to 5.

('Predict ', array([1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0]))
('Expected', array([1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1]))
0.9

It went down... I had a feeling it wasn't something you could just keep raising, and sure enough. Still, even for a problem of this scale, changing a parameter changes the result. I learned firsthand how important parameter tuning is in machine learning.

And this parameter tuning: I feel I can't do it well without a bit more specialized knowledge. Just running things a little in Python and being happy about it doesn't seem to be enough.

I will study a bit more seriously and try again.
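One common way to compare settings without relying on 20 test samples is cross-validation. Below is a minimal sketch of sweeping max_depth over the three values tried above. I don't have the CodeIQ files here, so it assumes a synthetic two-feature dataset from `make_classification`; the sample count, fold count, and `random_state` are arbitrary, and the scores it prints say nothing about the CodeIQ problem itself.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: 200 samples, 2 features, 2 classes
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

scores = {}
for depth in (1, 3, 5):
    clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                     max_depth=depth, random_state=0)
    # Mean accuracy over 5 folds; each fold is held out once for scoring
    scores[depth] = cross_val_score(clf, X, y, cv=5).mean()
    print(depth, scores[depth])

# The depth with the highest mean cross-validated accuracy
best_depth = max(scores, key=scores.get)
print("best max_depth:", best_depth)
```

With the real data, `X` and `y` would be replaced by the training arrays loaded from `CodeIQ_auth.txt`, and the test file would be kept aside for a final check only.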
