University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (9)

Last time: University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (8) Make your own stochastic steepest descent method
Code: https://github.com/legacyworld/sklearn-basic

Challenge 5.3 Hinge loss and square loss

The task is to classify two clusters of points. It is split into two cases: one without outliers, and one with an outlier that deviates significantly from the clusters.

Hinge loss with no outliers

The YouTube commentary starts at around the 54-minute mark of lecture 6 (1). The hinge-loss case without outliers is in fact the same as an official scikit-learn example; for me, the matplotlib part was harder than the SVM part. The original example is here: https://scikit-learn.org/stable/auto_examples/svm/plot_separating_hyperplane.html#sphx-glr-auto-examples-svm-plot-separating-hyperplane-py I added comments for my own learning.

python:Homework_5.3_hinge_no_outlier.py


import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_blobs

#Create 40 random classification data points; centers specifies the number of clusters
X, y = make_blobs(n_samples=40, centers=2, random_state=6)
#kernel='linear' gives the linear (hinge-loss) SVM; the larger C is, the weaker the regularization
clf = svm.SVC(kernel='linear', C=1000)
clf.fit(X, y)
#Plot the classification data; cmap determines the colors
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)

#Prepare to draw the decision boundary: get the current axes and their limits
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()

#Make a 30x30 grid
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
#Evaluate the decision function at each grid point
Z = clf.decision_function(xy).reshape(XX.shape)

#Draw the decision boundary as a contour line: levels=0 is the boundary, levels=-1 and 1 are the margins
ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5,
           linestyles=['--', '-', '--'])
#Mark the support vectors, the points that attain the smallest margin
ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=100,
           linewidth=1, facecolors='none', edgecolors='k')
plt.savefig("5.3.png")

Only two lines (clf.fit and clf.decision_function) do any actual computation; the rest is plotting. The result is shown in 5.3.png.
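As a sanity check, the fitted separating line and the hinge loss on the training data can be printed directly. This is a minimal sketch assuming the `clf`, `X`, and `y` from the script above; `sklearn.metrics.hinge_loss` takes the true labels and the decision-function values:

```python
from sklearn.metrics import hinge_loss

# Fitted separating line: w_1*x_1 + w_2*x_2 + b = 0
print("coef:", clf.coef_[0], "intercept:", clf.intercept_[0])

# Mean hinge loss on the training data; labels are remapped from 0,1 to -1,1
# to match the sign convention of the decision function
print("hinge loss:", hinge_loss(y * 2 - 1, clf.decision_function(X)))
```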

Square loss with no outliers

I was able to understand this part because the commentary video included source code; without it, it would have been impossible for me.

python:Homework_5.3_square_no_outlier.py


import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn.datasets import make_blobs

#Create 40 random classification data points; centers specifies the number of clusters
X, y = make_blobs(n_samples=40, centers=2, random_state=6)
#Map y from the values 0, 1 to -1, 1
y = y*2-1
#Square loss: ordinary least-squares regression
#(the original passed normalize=True, which was removed in scikit-learn 1.2;
# for plain least squares it does not change the fitted line)
clf = linear_model.LinearRegression(fit_intercept=True, copy_X=True)
clf.fit(X, y)
#Plot the classification data; cmap determines the colors
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)

#Drawing the decision boundary
x_plot = np.linspace(4,10,100)
# w = [w_0, w_1, w_2] = [intercept, coefficient of x_1, coefficient of x_2]
w = [clf.intercept_,clf.coef_[0],clf.coef_[1]]
y_plot = -(w[1]/w[2]) * x_plot - w[0]/w[2]
plt.plot(x_plot,y_plot)
plt.savefig("5.3.png")

The idea is to run a linear multiple regression with the $X$ created by make_blobs as the features (two of them) and $y$ as the target. In this example there are 40 samples. With the horizontal axis as $x_1$ and the vertical axis as $x_2$ in the graph above, the regression model can be expressed as

$$y = w_0 + w_1\times x_1 + w_2\times x_2$$

In make_blobs, $y$ takes the values $0, 1$, but y = y*2-1 changes this to $y = -1, 1$. The decision boundary can be drawn by setting $y = 0$:

$$0 = w_0 + w_1\times x_1 + w_2\times x_2 \\
x_2 = -\frac{w_0}{w_2} - \frac{w_1}{w_2}x_1$$

This is exactly what the last part of the source code computes. The plot I drew: 5.3.png
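As a quick check that this regression line really separates the two clusters, one can classify each point by the sign of the regression output and compare against the labels. A minimal sketch, assuming the `clf`, `X`, and `y` (already mapped to -1, 1) from the script above:

```python
# Classify by the sign of the regression output and measure training accuracy
pred = np.sign(clf.predict(X))
print("training accuracy:", np.mean(pred == y))
```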

Intent of this exercise

When there are no large outliers, hinge loss and square loss give similar results. When there is a large outlier, however, the square loss overestimates the loss contributed by the outlier, so the correct boundary cannot be obtained. This will be explained next time.
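The difference can be seen directly from the two loss functions: for a label $y \in \{-1, 1\}$ and model output $f(x)$, the hinge loss is $\max(0, 1 - yf(x))$ while the square loss is $(y - f(x))^2$. A small numeric sketch (the value $f(x) = 5$ is made up for illustration):

```python
import numpy as np

def hinge(y, f):
    return np.maximum(0, 1 - y * f)

def square(y, f):
    return (y - f) ** 2

# A correctly classified point far from the boundary, e.g. an outlier with y = 1, f(x) = 5
y, f = 1, 5.0
print(hinge(y, f))   # 0.0  -> hinge loss gives no penalty once the margin is satisfied
print(square(y, f))  # 16.0 -> square loss penalizes it heavily, pulling the boundary toward it
```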

Past posts

University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (1)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (2)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (3)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (4)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (5)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (6)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (7) Make your own steepest descent method
https://github.com/legacyworld/sklearn-basic
https://ocw.tsukuba.ac.jp/course/systeminformation/machine_learning/
