[PYTHON] scikit-learn's tree.plot_tree is easy and convenient, so here is a short summary of how to use it

scikit-learn, a machine learning library for Python, is widely used because it makes it easy to experiment with a variety of algorithms. TensorFlow and PyTorch may be the stars of the field, but they can be hard to adopt in more conservative workplaces. Since version 0.21, scikit-learn has included a convenient function for drawing a trained decision tree, a typical supervised learning method, so I tried it out and compared it with the conventional approach using GraphViz.

Traditional visualization method: Using GraphViz

Previously, drawing a decision tree meant installing and using a separate library called GraphViz, which takes some time and effort.

Install GraphViz@Mac


brew install graphviz
pip install graphviz

Install GraphViz@Ubuntu


sudo apt install -y graphviz
pip install graphviz

Method using GraphViz


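import graphviz
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Train a decision tree on the iris dataset
iris = load_iris()
clf = DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)

# Export the tree as DOT source; pass the column names as feature_names
graph = graphviz.Source(tree.export_graphviz(clf, feature_names=iris.feature_names, filled=True))
graph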

Execution result

Calling graph.render('decision_tree') saves the result as a PDF.

[Figure: decision tree rendered with GraphViz]
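
As a side note, export_graphviz returns the DOT description as a plain string when out_file is None, so the tree can be inspected or written to a file even before GraphViz is installed. The following is only a minimal sketch under that assumption; the file name decision_tree.dot and the random_state value are chosen here for illustration and are not from the original article.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Train a decision tree on the iris dataset (random_state is an illustrative choice)
iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# With out_file=None, export_graphviz returns the DOT source as a string
dot_source = export_graphviz(
    clf, out_file=None, feature_names=iris.feature_names, filled=True
)

# Print the first few lines of the DOT description
print("\n".join(dot_source.splitlines()[:3]))

# Hypothetical file name; the file can be rendered later with the GraphViz CLI
with open("decision_tree.dot", "w", encoding="utf-8") as f:
    f.write(dot_source)
```

Once GraphViz is installed, a file like this can be converted with a command such as dot -Tpdf decision_tree.dot -o decision_tree.pdf.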

Use tree.plot_tree

Let's draw a figure similar to the GraphViz one using tree.plot_tree. Because it is part of scikit-learn's tree module, no additional installation is required beyond matplotlib, which it uses for drawing.

Method using tree.plot_tree


from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree
import matplotlib.pyplot as plt

# Train a decision tree on the iris dataset
iris = load_iris()
clf = DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)

# Draw the trained tree on a matplotlib figure
plt.figure(figsize=(15, 10))
plot_tree(clf, feature_names=iris.feature_names, filled=True)
plt.show()

Execution result

This produced the same figure as the GraphViz method. When you run it in Jupyter Notebook, you can right-click the rendered figure and save it as an image.

[Figure: decision tree drawn with plot_tree]
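
Outside of a notebook, the figure can also be written to a file directly with matplotlib's savefig. The following is a rough sketch rather than part of the original article: the Agg backend, the file name decision_tree.png, the dpi value, and random_state are all assumptions made for illustration.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Train a decision tree on the iris dataset (random_state is an illustrative choice)
iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# Draw the tree on an explicit Axes so the figure can be saved afterwards
fig, ax = plt.subplots(figsize=(15, 10))
annotations = plot_tree(clf, feature_names=iris.feature_names, filled=True, ax=ax)

# Hypothetical output file name; bbox_inches="tight" trims the surrounding whitespace
out_path = Path("decision_tree.png")
fig.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(fig)
```

Changing the extension to .pdf or .svg should give a vector image, since savefig chooses the format from the file name.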

Summary

Having compared scikit-learn's tree.plot_tree with the traditional GraphViz approach for visualizing decision trees, I found tree.plot_tree easier and more convenient. I plan to use it actively from now on.
