University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (4)

University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (1)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (2)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (3)
https://github.com/legacyworld/sklearn-basic

Exercise 3.4 Regularization of polynomial simple regression

This time the topic is ridge regression. There does not seem to be a dedicated explanation on YouTube, but some results appear around 9 minutes 45 seconds into the 4th lecture, part (1). The source code is almost the same as in Exercise 3.2. In this exercise the polynomial degree is fixed at 30, and the regularization parameter is varied instead over [1e-30,1e-20,1e-10,1e-5, 1e-3,1e-2,1e-1, 1,10,100]. In the source code the regularization parameter is named `alpha`, because `lambda` is a reserved word in Python.
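For reference, the role of the regularization parameter can be written as the objective that scikit-learn's `Ridge` documents. Here $X$ is assumed to be the matrix of the 30 polynomial features, $w$ the coefficient vector, and $\alpha$ the regularization parameter; the intercept is fitted separately and is not penalized.

```latex
\min_{w} \; \lVert y - Xw \rVert_2^2 + \alpha \, \lVert w \rVert_2^2
```

With $\alpha$ close to zero this is essentially ordinary least squares on a degree-30 polynomial, and with a large $\alpha$ the coefficients are pushed toward zero.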

python:Homework_3.4.py


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures as PF
from sklearn import linear_model
from sklearn.pipeline import Pipeline
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score

DEGREE = 30

def true_f(x):
    return np.cos(1.5 * x * np.pi)

np.random.seed(0)
n_samples = 30

#X-axis data for drawing
x_plot = np.linspace(0,1,100)
#Training data
x_tr = np.sort(np.random.rand(n_samples))
y_tr = true_f(x_tr) + np.random.randn(n_samples) * 0.1
#Convert to Matrix
X_tr = x_tr.reshape(-1,1)
X_plot = x_plot.reshape(-1,1)
degree = DEGREE
alpha_list = [1e-30,1e-20,1e-10,1e-5, 1e-3,1e-2,1e-1, 1,10,100]
for alpha in alpha_list:
    plt.scatter(x_tr,y_tr,label="Training Samples")
    plt.plot(x_plot,true_f(x_plot),label="True")
    plt.xlim(0,1)
    plt.ylim(-2,2)
    filename = f"{alpha}.png"
    pf = PF(degree=degree,include_bias=False)
    linear_reg = linear_model.Ridge(alpha=alpha)
    steps = [("Polynomial_Features",pf),("Linear_Regression",linear_reg)]
    pipeline = Pipeline(steps=steps)
    pipeline.fit(X_tr,y_tr)
    plt.plot(x_plot,pipeline.predict(X_plot),label="Model")
    y_predict = pipeline.predict(X_tr)
    mse = mean_squared_error(y_tr,y_predict)
    scores = cross_val_score(pipeline,X_tr,y_tr,scoring="neg_mean_squared_error",cv=10)
    plt.title(f"Degree: {degree}, Lambda: {alpha}\nTrainErr: {mse:.2e} TestErr: {-scores.mean():.2e}(+/- {scores.std():.2e})")
    plt.legend()
    plt.savefig(filename)
    plt.clf()
    print(f"Regularization parameters= {alpha},Training error= {mse},Test error= {-scores.mean():.2e}")

The only change from the previous script is this line:

linear_reg = linear_model.Ridge(alpha=alpha)

Running this program produces the following warning.

/usr/local/lib64/python3.6/site-packages/sklearn/linear_model/_ridge.py:190: UserWarning: Singular matrix in solving dual problem. Using least-squares solution instead.
  warnings.warn("Singular matrix in solving dual problem. Using "

The warning seems to be raised at the point where the system is solved by Cholesky decomposition, but I was not sure.

linear_model/_ridge.py


try:
   # Note: we must use overwrite_a=False in order to be able to
   #       use the fall-back solution below in case a LinAlgError
   #       is raised
   dual_coef = linalg.solve(K, y, sym_pos=True,overwrite_a=False)
except np.linalg.LinAlgError:
   warnings.warn("Singular matrix in solving dual problem. Using "
   "least-squares solution instead.")
   dual_coef = linalg.lstsq(K, y)[0]

It seems that the solver falls back to a least-squares solution because the system of linear equations cannot be solved directly. I did not understand why the warning appears only for 1e-30 and 1e-20.
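One possible explanation, offered as a guess rather than something confirmed in the scikit-learn source: if the dual problem solves (K + alpha * I) c = y with K = X X^T, then alpha values of 1e-30 or 1e-20 may simply be too small to change K in 64-bit floating point, so the matrix stays numerically singular. The sketch below checks this with NumPy only. The helper name `diagonal_entries_changed` is hypothetical, and the feature matrix is built by hand rather than with `PolynomialFeatures`, so the values may differ slightly from those inside the pipeline.

```python
import numpy as np

# Same training inputs as in Homework_3.4.py
np.random.seed(0)
x_tr = np.sort(np.random.rand(30))

# Hand-built polynomial features x^1 .. x^30 (no bias column)
X = x_tr[:, None] ** np.arange(1, 31)

# Assumption: the dual problem uses the Gram matrix K = X X^T
K = X @ X.T


def diagonal_entries_changed(K, alpha):
    """Count diagonal entries of K that actually change when alpha is added."""
    diag = np.diag(K)
    return int(np.count_nonzero(diag + alpha != diag))


for alpha in [1e-30, 1e-20, 1e-10, 1e-5]:
    K_reg = K + alpha * np.eye(len(K))
    try:
        np.linalg.cholesky(K_reg)
        status = "Cholesky succeeded"
    except np.linalg.LinAlgError:
        status = "Cholesky failed"
    changed = diagonal_entries_changed(K, alpha)
    print(f"alpha={alpha:g}: {changed} of {len(K)} diagonal entries changed, {status}")
```

If the count is zero for the smallest values, the regularized matrix is bit-for-bit identical to K, which would be consistent with the warning showing up only there. Note that the warning in the article most likely comes from fits on fewer samples during cross-validation, so this is only an illustration of the effect.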

Execution result

Regularization parameters= 1e-30,Training error= 0.002139325105436034,Test error= 5.11e+02
Regularization parameters= 1e-20,Training error= 0.004936191193133389,Test error= 5.11e+02
Regularization parameters= 1e-10,Training error= 0.009762751388489265,Test error= 1.44e+02
Regularization parameters= 1e-05,Training error= 0.01059565315043209,Test error= 2.79e-01
Regularization parameters= 0.001,Training error= 0.010856091742299396,Test error= 6.89e-02
Regularization parameters= 0.01,Training error= 0.012046102850453813,Test error= 7.79e-02
Regularization parameters= 0.1,Training error= 0.02351033489834412,Test error= 4.94e-02
Regularization parameters= 1,Training error= 0.11886509938269865,Test error= 2.26e-01
Regularization parameters= 10,Training error= 0.31077333649742883,Test error= 4.71e-01
Regularization parameters= 100,Training error= 0.41104732329314453,Test error= 5.20e-01

- Minimum training error
  - Regularization parameter = 1e-30
  - Figure: 1e-30.png

- Minimum test error
  - Regularization parameter = 0.1
  - The lecture explanation says 0.01 is the best, but in this program 0.1 gave a smaller test error.
  - Figure: 0.1.png

If the regularization is too strong, the model does little more than predict the average; if it is too weak, the model overfits. Cross-validation is therefore needed to tune the regularization parameter in ridge regression as well.
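The tuning step can also be automated. The following is a minimal sketch that swaps the manual loop for scikit-learn's `GridSearchCV`, reusing the same data generation and pipeline step names as the script above. The narrower grid (omitting 1e-30, 1e-20 and 1e-10) is my own choice to avoid the singular-matrix warning, not part of the exercise.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Same training data as in Homework_3.4.py
np.random.seed(0)
n_samples = 30
x_tr = np.sort(np.random.rand(n_samples))
y_tr = np.cos(1.5 * x_tr * np.pi) + np.random.randn(n_samples) * 0.1
X_tr = x_tr.reshape(-1, 1)

pipeline = Pipeline(steps=[
    ("Polynomial_Features", PolynomialFeatures(degree=30, include_bias=False)),
    ("Linear_Regression", Ridge()),
])

# Step name + "__alpha" addresses the Ridge parameter inside the pipeline
alpha_grid = [1e-5, 1e-3, 1e-2, 1e-1, 1, 10, 100]
param_grid = {"Linear_Regression__alpha": alpha_grid}

# 10-fold cross-validation with negative MSE, as in the original script
search = GridSearchCV(pipeline, param_grid, scoring="neg_mean_squared_error", cv=10)
search.fit(X_tr, y_tr)

best_alpha = search.best_params_["Linear_Regression__alpha"]
print(f"Best alpha: {best_alpha}, CV test error: {-search.best_score_:.2e}")
```

After fitting, `search.best_estimator_` is the pipeline refitted on all training data with the selected alpha, so it can be used directly for prediction.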
