I'm an amateur on my 14th day of Python, but I want to try machine learning with scikit-learn

It's just a blog

What I want to do

I have teacher (training) data like the table below. I want to write a machine learning program that, given a person's features, predicts whether that person is a programmer, and then play with it.

Teacher data:

sex    age  Profession
man    28   Programmer
woman  20   Not a programmer
man    32   Programmer
man    67   Not a programmer
woman  8    Programmer

Data to predict:

sex    age  Profession
woman  28   ?

Knowledge

Machine learning can be broadly divided into supervised learning and unsupervised learning, and what I want to do this time is supervised learning. Supervised methods include **regression** and **classification**, and the one to use this time is classification (I think). **Regression** is used to predict a number from data, while **classification** is used to predict a category from data. I want to classify whether someone is a programmer based on gender and age.
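To make the difference concrete, here is a minimal sketch that trains one classifier and one regressor on the same feature. The toy data (the ages, the labels, and the "hours of coding per week") is invented purely for illustration, and svm.SVR is just one regressor picked to mirror svm.SVC.

```python
from sklearn import svm

# Toy data invented for illustration: one feature (age) per sample.
ages = [[8], [20], [28], [32], [67]]

# Classification: the targets are discrete labels (0 or 1).
labels = [0, 0, 1, 1, 0]
classifier = svm.SVC()
classifier.fit(ages, labels)
predicted_label = classifier.predict([[30]])[0]

# Regression: the targets are continuous numbers
# (hypothetical hours of coding per week).
hours = [1.0, 5.5, 40.0, 38.5, 2.0]
regressor = svm.SVR()
regressor.fit(ages, hours)
predicted_hours = regressor.predict([[30]])[0]

print(predicted_label)  # one of the labels seen during training
print(predicted_hours)  # a floating-point number
```

The classifier can only ever answer with one of the labels it was trained on, while the regressor answers with a number that does not have to appear in the training data.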

Environment

Python 2.7.10
scikit-learn 0.19.0

Try using scikit-learn

It seems that scikit-learn comes with sample data sets that can be used as teacher data, so I will use one of them for the time being.

I will try using iris, a data set about iris flowers. The flow is: train on the feature data paired with the iris varieties, then give the model a set of features and have it predict the class (variety). If that works, the goal is achieved for the time being.

learn_iris.py


from sklearn import datasets
#Read sample data
iris = datasets.load_iris()

iris.data is the feature data, and iris.target is the classification (label) data.

>>> iris.data  #Feature data: sepal length, sepal width, petal length, petal width (cm)
array([[ 5.1,  3.5,  1.4,  0.2],
       [ 4.9,  3. ,  1.4,  0.2],
       [ 4.7,  3.2,  1.3,  0.2],
       [ 4.6,  3.1,  1.5,  0.2],
       [ 5. ,  3.6,  1.4,  0.2],
       ...,
       [ 5.9,  3. ,  5.1,  1.8]])

>>> iris.target  #Varieties paired with the features (types of iris) 0: "setosa", 1: "versicolor", 2: "virginica"
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
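Rather than reading the raw arrays, the data set object can describe itself. The following is a small sketch using the shape, feature_names and target_names attributes of the object returned by load_iris(); it only inspects the data and does no learning.

```python
from sklearn import datasets

iris = datasets.load_iris()

# 150 samples, each with 4 features
print(iris.data.shape)
# One label per sample
print(iris.target.shape)
# Names of the 4 feature columns
print(iris.feature_names)
# Names of the classes, indexed by the numbers in iris.target
print(list(iris.target_names))

# Translate a numeric label back into a variety name
first_label_name = iris.target_names[iris.target[0]]
print(first_label_name)
```

Indexing target_names with a predicted number is a convenient way to turn a result like 0 back into a readable variety name.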

The fit method does the learning. If you give the predict method a set of features, it predicts the variety.

I don't really know what svm.SVC() does internally. It seems to be a supervised learning model called a support vector machine.

This time, I will give it the features of a setosa and see whether it classifies them correctly.

learn_iris.py


from sklearn import datasets, svm
#Read sample data
iris = datasets.load_iris()

#Learning
clf = svm.SVC()
clf.fit(iris.data, iris.target)

#Give the setosa features and try to classify them properly
test_data = [[ 5.1,  3.5,  1.4,  0.2]]
print(clf.predict(test_data))

result


[0]

It was classified correctly!
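One caveat: the setosa sample used for the check is itself part of the data the model learned from, so getting it right is not much of a test. A rough sketch of a fairer check, assuming a 70/30 split chosen arbitrarily for illustration, holds some samples back with train_test_split and measures accuracy only on those.

```python
from sklearn import datasets, svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# Hold back 30% of the samples; the model never sees them while learning.
# random_state is fixed only so that the split is repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target,
    test_size=0.3, random_state=0, stratify=iris.target)

clf = svm.SVC()
clf.fit(X_train, y_train)

# Fraction of held-back samples whose variety was predicted correctly
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(accuracy)
```

The stratify argument keeps the three varieties in the same proportion in both parts, which matters for a data set this small.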

Try playing

Create teacher data based on the table at the beginning and have the model estimate whether a 28-year-old woman is a programmer.

learn_pg.py


from sklearn import datasets, svm
#Feature data [sex (0: man, 1: woman), age]
feature = [
    [0, 28],
    [1, 20],
    [0, 32],
    [0, 67],
    [1, 8]
]
#Correct answer data 0:Not a programmer 1:Programmer
job = [1, 0, 1, 0, 1]

#Data to predict: woman, 28 years old
test_data = [[1, 28]]

#Learning
clf = svm.SVC()
clf.fit(feature, job)

print("Programmer" if clf.predict(test_data)[0] else "Not a programmer")

result


Programmer
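To play a little more, several people can be passed to predict in one call. This is a minimal sketch that reuses the teacher data from the table; the extra candidates are made up for illustration, and with only five training rows the answers should not be taken seriously.

```python
from sklearn import svm

# Teacher data from the table: [sex (0: man, 1: woman), age]
feature = [[0, 28], [1, 20], [0, 32], [0, 67], [1, 8]]
# 0: not a programmer, 1: programmer
job = [1, 0, 1, 0, 1]

clf = svm.SVC()
clf.fit(feature, job)

# Hypothetical people to ask about (invented for illustration)
candidates = [[1, 28], [0, 28], [1, 60], [0, 10]]
predictions = clf.predict(candidates)

labels = ["Programmer" if p else "Not a programmer" for p in predictions]
for (sex, age), label in zip(candidates, labels):
    print("sex={} age={} -> {}".format(sex, age, label))
```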

It seems she is a programmer! Apparently you can choose between learning models and check the accuracy of each one, but this is the end for the time being. Even someone with neither Python knowledge nor machine learning knowledge could do machine learning.
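As a pointer for next time, here is a rough sketch of how the accuracy of a few models could be compared on the iris data with cross_val_score. The choice of the three models and of 5 folds is my own assumption for illustration, not a recommendation.

```python
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()

# Candidate models, all left at their default settings
models = {
    "SVC": svm.SVC(),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
}

mean_scores = {}
for name, model in models.items():
    # 5-fold cross-validation: train on 4/5 of the data, score on the rest,
    # repeated so that every sample is used for scoring exactly once
    scores = cross_val_score(model, iris.data, iris.target, cv=5)
    mean_scores[name] = scores.mean()
    print("{}: {:.3f}".format(name, mean_scores[name]))
```

Each printed number is the mean accuracy over the five folds, so the models can be compared on the same footing.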
