Last time: University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (5)
https://github.com/legacyworld/sklearn-basic
Commentary video on YouTube: 5th lecture (1), around 15 minutes 50 seconds.
This exercise is almost the same as Exercise 4.1, but this time the task is to sweep the regularization parameter over a much wider range ($10^{-3}$ to $10^{6}$) and observe the effect on each coefficient. I think it is a good exercise because the difference between Ridge and Lasso shows up clearly.
This is the source code.
python:Homework_4.2.py
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn import preprocessing

# Load the red wine quality data
df = pd.read_csv('winequality-red.csv', sep=';')
# The target value "quality" is included, so create a dataframe with it dropped
df1 = df.drop(columns='quality')
y = df['quality'].values.reshape(-1, 1)
scaler = preprocessing.StandardScaler()
# Regularization parameter
alpha = 10 ** (-4)
X = df1.values
X_fit = scaler.fit_transform(X)
# DataFrames for storing the results (one row of coefficients per alpha)
df_ridge = pd.DataFrame(columns=np.append(df1.columns, 'alpha'))
df_lasso = pd.DataFrame(columns=np.append(df1.columns, 'alpha'))
while alpha <= 10 ** 6 + 1:
    # Ridge regression
    model_ridge = linear_model.Ridge(alpha=alpha)
    model_ridge.fit(X_fit, y)
    df_ridge.loc[len(df_ridge)] = np.append(model_ridge.coef_[0], alpha)
    # Lasso regression
    model_lasso = linear_model.Lasso(alpha=alpha)
    model_lasso.fit(X_fit, y)
    df_lasso.loc[len(df_lasso)] = np.append(model_lasso.coef_, alpha)
    alpha = alpha * 10 ** (0.1)
# Plot the Ridge coefficient paths
for column in df_ridge.drop(columns='alpha'):
    plt.plot(df_ridge['alpha'], df_ridge[column])
plt.xscale('log')
plt.gca().invert_xaxis()
plt.savefig("ridge.png")
plt.clf()
# Plot the Lasso coefficient paths
for column in df_lasso.drop(columns='alpha'):
    plt.plot(df_lasso['alpha'], df_lasso[column])
plt.xscale('log')
plt.gca().invert_xaxis()
plt.savefig("lasso.png")
By the way, the `+ 1` in `while alpha <= 10 ** 6 + 1:` is there because without it the final $10^{6}$ step is not executed.
I think this is because alpha is advanced with `alpha = alpha * 10 ** (0.1)`, so floating-point error accumulates and alpha ends up slightly above $10^{6}$.
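A minimal sketch of that accumulation (using np.logspace as a drift-free alternative is my own addition, not part of the assignment):

import numpy as np

# Starting from 10**-4 and multiplying by 10**0.1 a hundred times should land
# exactly on 10**6, but the accumulated rounding error leaves alpha slightly above it,
# so `alpha <= 10 ** 6` is False on the last step without the `+ 1`.
alpha = 10 ** (-4)
for _ in range(100):
    alpha = alpha * 10 ** (0.1)
print(alpha, alpha <= 10 ** 6)

# A drift-free alternative: generate the whole grid up front
alphas = np.logspace(-4, 6, 101)
print(alphas[-1] <= 10 ** 6)  # the endpoint is exactly 10**6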
This time I made two changes when drawing.
Make only the X axis logarithmic (semi-log plot): plt.xscale('log')
Reverse the X axis so that smaller values are on the right: plt.gca().invert_xaxis()
Ridge regression (left), Lasso regression (right)
There is a clear difference between Ridge, where the regularization takes effect gradually, and Lasso, where coefficients converge to 0 one after another, starting from those with the smallest absolute values. (The problem started from $10^{-3}$, but I changed the range because the Ridge regression plot in the commentary was drawn from $10^{-2}$.)
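For reference, the objective functions minimized by scikit-learn's Ridge and Lasso (these are the standard scikit-learn formulations, not something taken from the course slides) make that difference easy to see:

Ridge: $\min_w \|y - Xw\|_2^2 + \alpha \|w\|_2^2$
Lasso: $\min_w \frac{1}{2 n_{samples}} \|y - Xw\|_2^2 + \alpha \|w\|_1$

The L1 penalty of Lasso drives individual coefficients exactly to 0 as $\alpha$ grows, which is why its coefficients drop out one by one, while the L2 penalty of Ridge only shrinks them all gradually.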
Both Ridge and Lasso show a pattern in which the absolute value of some coefficients temporarily increases even as the regularization parameter increases. The graph below is drawn by extracting only those features that move. Ridge appears to move a lot, but the Y-axis scale is different by a factor of 10. Perhaps what happens is that when the coefficient of one feature becomes smaller, another feature that was not noticeable until then comes to the surface. I did not expect that kind of behavior, so this was a good exercise that reminded me that machine learning is not always straightforward.
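A minimal sketch of how those moving features could be picked out of the df_ridge / df_lasso DataFrames built above and re-plotted (the 5% threshold, the criterion of comparing each coefficient's largest absolute value with its value at the smallest alpha, and the output filenames are my own assumptions, not taken from the assignment):

import matplotlib.pyplot as plt

# Assumes df_ridge and df_lasso from the script above.
def plot_moving_features(df_coef, filename, factor=1.05):
    coefs = df_coef.drop(columns='alpha').astype(float)
    # A feature "moves" if its largest |coefficient| along the path exceeds
    # its value at the smallest alpha by the given factor
    moving = [c for c in coefs.columns
              if coefs[c].abs().max() > coefs[c].abs().iloc[0] * factor]
    for column in moving:
        plt.plot(df_coef['alpha'], coefs[column], label=column)
    plt.xscale('log')
    plt.gca().invert_xaxis()
    plt.legend()
    plt.savefig(filename)
    plt.clf()

plot_moving_features(df_ridge, "ridge_moving.png")
plot_moving_features(df_lasso, "lasso_moving.png")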
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (1)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (2)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (3)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (4)