University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (2)

Continuation from the last time University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (1) https://github.com/legacyworld/sklearn-basic

Challenge 3.1 Polynomial regression squared error

This is a problem of creating training data with an error of $ N (0,1) \ times0.1 $ on $ y = \ sin (x) $ and regressing it with a polynomial. Explanation is the 3rd (1) per 56 minutes 40 seconds

python:Homework_3.1.py


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import copy
from sklearn.preprocessing import PolynomialFeatures as PF
from sklearn import linear_model
from sklearn.metrics import mean_squared_error
#Number of training data
NUM_TR = 6
np.random.seed(0)
rng = np.random.RandomState(0)

#X-axis data for drawing
x_plot = np.linspace(0,10,100)
#Training data
tmp = copy.deepcopy(x_plot)
rng.shuffle(tmp)
x_tr = np.sort(tmp[:NUM_TR])
y_tr = np.sin(x_tr) + 0.1*np.random.randn(NUM_TR)

#Convert to Matrix
X_tr = x_tr.reshape(-1,1)
X_plot = x_plot.reshape(-1,1)

#Data for polynomials
#Degree
degree = 1
pf = PF(degree=degree)
X_poly = pf.fit_transform(X_tr)
X_plot_poly = pf.fit_transform(X_plot)

model = linear_model.LinearRegression()
model.fit(X_poly,y_tr)
fig = plt.figure()
plt.scatter(x_tr,y_tr,label="training Samples")
plt.plot(x_plot,model.predict(X_plot_poly),label=f"degree = {degree}")
plt.legend()
plt.ylim(-2,2)
fig.savefig(f"{degree}.png ")

#Data for polynomials
#All orders
fig = plt.figure()
plt.scatter(x_tr,y_tr,label="Training Samples")

for degree in range(1,NUM_TR):
    pf = PF(degree=degree)
    X_poly = pf.fit_transform(X_tr)
    X_plot_poly = pf.fit_transform(X_plot)
    model = linear_model.LinearRegression()
    model.fit(X_poly,y_tr)
    plt.plot(x_plot,model.predict(X_plot_poly),label=f"degree {degree}")
    plt.legend()
    mse = mean_squared_error(y_tr,model.predict(X_poly))
    print(f"degree = {degree} mse = {mse}")

plt.xlim(0,10)
plt.ylim(-2,2)
fig.savefig('all_degree.png')

We have prepared two data (x_tr) for calculating regression and one for graph drawing (x_plot). If you simply do x_tr = x_plot, the actual data will not be copied. If you do it as it is, the number of drawing data will also be NUM_TR in the part ofx_tr = np.sort (tmp [: NUM_TR]), and the graph drawing will be strange. So I use deepcopy.

The original data is prepared by dividing 0-10 into 100 equal parts. Randomly select only NUM_TR of the training data (6 in the course) As an error, the random number generated between 0-1 multiplied by 1/10 is added to sin (x_tr). Since the seed is fixed at the beginning, the same result is obtained no matter how many times it is executed in any environment. This is the prepared data training.png

What is different from the past is the part called Polynomial Features. This is the part that prepares training data such as $ x, x ^ 2, x ^ 3, x ^ 4 $ for the degree of the polynomial. For example, if the order = 3, then this is the case.

degree = 3
pf = PF(degree=degree)
X_poly = pf.fit_transform(X_tr)
print(f"degree = {degree}\nX_Tr = {X_tr}\nX_poly = {X_poly}")

The execution result is

degree = 3
X_Tr = [[0.2020202 ]
 [2.62626263]
 [5.55555556]
 [7.57575758]
 [8.68686869]
 [9.39393939]]
X_poly = [[1.00000000e+00 2.02020202e-01 4.08121620e-02 8.24488122e-03]
 [1.00000000e+00 2.62626263e+00 6.89725538e+00 1.81140040e+01]
 [1.00000000e+00 5.55555556e+00 3.08641975e+01 1.71467764e+02]
 [1.00000000e+00 7.57575758e+00 5.73921028e+01 4.34788658e+02]
 [1.00000000e+00 8.68686869e+00 7.54616876e+01 6.55525771e+02]
 [1.00000000e+00 9.39393939e+00 8.82460973e+01 8.28978490e+02]]

In the first data, the original training data is $ x = 2.020202 \ times10 ^ {-1} $ and $ x ^ 2 = 4.08 \ times10 ^ {-2} $. The point is that $ x ^ 2 and x ^ 3 $ are treated as different features.

Next, let's regress in the first order (straight line). The result is this. 1.png

Finally, change the order by 1-5 and draw each on the graph. all_degree.png

This is the error. The numbers did not match by degree = 5.

degree = 1 mse = 0.33075005001856256
degree = 2 mse = 0.3252271169458752
degree = 3 mse = 0.30290034474812344
degree = 4 mse = 0.010086018410257538
degree = 5 mse = 3.1604543144050787e-22

Recommended Posts

University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (17)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (5)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (16)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (10)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (2)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (13)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (9)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (4)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (12)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (1)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (11)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (3)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (14)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (6)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (15)
University of Tsukuba Machine Learning Course: Study sklearn while making the Python script part of the task (7) Make your own steepest descent method
University of Tsukuba Machine Learning Course: Study sklearn while making the Python script part of the task (8) Make your own stochastic steepest descent method
Python & Machine Learning Study Memo ⑤: Classification of irises
Python & Machine Learning Study Memo ②: Introduction of Library
Image collection Python script for creating datasets for machine learning
Summary of the basic flow of machine learning with Python
The result of Java engineers learning machine learning in Python www
[Machine learning pictorial book] A memo when performing the Python exercise at the end of the book while checking the data
Python learning memo for machine learning by Chainer until the end of Chapter 2
Python & Machine Learning Study Memo: Environment Preparation
Learning notes from the beginning of Python 1
I installed Python 3.5.1 to study machine learning
Python Basic Course (at the end of 15)
Python & Machine Learning Study Memo ③: Neural Network
Python & Machine Learning Study Memo ④: Machine Learning by Backpropagation
Learning notes from the beginning of Python 2
Python & Machine Learning Study Memo ⑥: Number Recognition
Align the number of samples between classes of data for machine learning with Python
Introducing the book "Creating a profitable AI with Python" that allows you to learn machine learning in the shortest course
[Python] Read the source code of Bottle Part 2
Classification of guitar images by machine learning Part 1
Machine learning starting with Python Personal memorandum Part2
The story of low learning costs for Python
2016 The University of Tokyo Mathematics Solved with Python
Machine learning starting with Python Personal memorandum Part1
Upgrade the Azure Machine Learning SDK for Python
EV3 x Python Machine Learning Part 2 Linear Regression
[Python] Read the source code of Bottle Part 1
About the development contents of machine learning (Example)
Machine learning memo of a fledgling engineer Part 2
Classification of guitar images by machine learning Part 2
Get a glimpse of machine learning in Python
Python & Machine Learning Study Memo ⑦: Stock Price Forecast
[Python + OpenCV] Whiten the transparent part of the image
Predicting the goal time of a full marathon with machine learning-③: Visualizing data with Python-
The first step of machine learning ~ For those who want to implement with python ~
[CodeIQ] I wrote the probability distribution of dice (from CodeIQ math course for machine learning [probability distribution])
[Machine learning] "Abnormality detection and change detection" Let's draw the figure of Chapter 1 in Python.