University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the task (16)

Previous post: University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (15) https://github.com/legacyworld/sklearn-basic

Challenge 7.3 Comparison of Gradient Method and Newton's Method in Logistic Regression

Commentary on YouTube: 8th lecture (1), around the 27-minute mark. I couldn't reproduce the result explained in the lecture, probably because my implementation of the steepest descent method was flawed. I tried ridge regression and other variations, but the result did not change much.

Mathematically, it implements the following:

The steepest descent method

E(w) = -\frac{1}{N}\sum_{n=1}^{N}\left\{t_n\log\hat{t}_n + (1-t_n)\log(1-\hat{t}_n)\right\}\\
\frac{\partial E(w)}{\partial w} = X^T(\hat{t}-t)\\
w \leftarrow w - \eta X^T(\hat{t}-t)

For the iris data, $N = 150$, and $w$ is 5-dimensional after adding the intercept. $E(w)$ is divided by $N$ because the initial cost does not match the lecture's otherwise. Even so, only $\eta = 0.1$ diverged; every other step size converged properly. I suspect something is still wrong, but I'm not sure what.
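As a quick sanity check on the normalization (a minimal sketch of my own, not from the lecture): with all weights initialized to zero, every sigmoid output is 0.5, so the normalized initial cost should be exactly $\log 2 \approx 0.693$ regardless of the data.

python:initial_cost_check.py

import numpy as np

# With beta = 0 the sigmoid returns 0.5 for every sample, so
# -(1/N) * sum(t*log(0.5) + (1-t)*log(0.5)) collapses to log 2.
N = 150
t = np.zeros(N)
t[100:] = 1                     # any 0/1 labels work
t_hat = np.full(N, 0.5)         # sigmoid(0) = 0.5
cost = -(1/N)*np.sum(t*np.log(t_hat) + (1-t)*np.log(1-t_hat))
print(cost, np.log(2))          # both ~0.6931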

Newton's method

\nabla\nabla E(w) = X^TRX,\quad R = \mathrm{diag}\bigl(\hat{t}_n(1-\hat{t}_n)\bigr)\ \text{($N\times N$ diagonal matrix)}\\
w \leftarrow w - (X^TRX)^{-1}X^T(\hat{t}-t)

Newton's method itself seems broadly correct, but the final cost unfortunately differs considerably from the lecture's. It is also puzzling that only three weights are displayed in the lecture's results.
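For reference, here is a minimal sketch of a single Newton (IRLS) update using the same $(d, N)$ layout for $X$ as the script below; the function name and the use of np.linalg.solve instead of an explicit inverse are my own choices, not from the lecture.

python:newton_step_sketch.py

import numpy as np

def newton_step(beta, X, t, t_hat):
    """One Newton update. X: (d, N), beta: (d, 1), t and t_hat: (N,)."""
    R = np.diag(t_hat * (1 - t_hat))        # N x N diagonal weight matrix
    H = X @ R @ X.T                         # Hessian X R X^T, shape (d, d)
    grad = X @ (t_hat - t).reshape(-1, 1)   # gradient X (t_hat - t), shape (d, 1)
    # Solving H p = grad is cheaper and more stable than np.linalg.inv(H) @ grad
    return beta - np.linalg.solve(H, grad)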

The source code is below. Since it reuses the source code from Exercise 4.3, the model is wrapped in a BaseEstimator subclass, but that has no particular significance here.

python:Homework_7.3.py


#Challenge 7.3 Comparison of gradient method and Newton's method in logistic regression
#Commentary on YouTube: 8th lecture (1), around the 27-minute mark
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing,metrics
from sklearn.linear_model import LogisticRegression
from sklearn.base import BaseEstimator
import statsmodels.api as sm
from sklearn.datasets import load_iris  
iris = load_iris()

class MyEstimator(BaseEstimator):
    def __init__(self,ep,eta):
        self.ep = ep
        self.eta = eta
        self.loss = []

    def fit(self, X, y,f):
        m = len(y)
        loss = []
        diff = 10**(10)
        ep = self.ep
        #Number of features (including the intercept column)
        dim = X.T.shape[1]
        #Initial value of beta
        beta = np.zeros(dim).reshape(-1,1)
        eta = self.eta
        
        while abs(diff) > ep:
            t_hat = self.sigmoid(beta.T,X)
            loss.append(-(1/m)*np.sum(y*np.log(t_hat) + (1-y)*np.log(1-t_hat)))
            #The steepest descent method
            if f == "GD":
                beta = beta - eta*np.dot(X,(t_hat-y).reshape(-1,1))
            #Newton's method
            else:
                #Diagonal matrix of NxN
                R = np.diag((t_hat*(1-t_hat))[0])
                #Hessian matrix
                H = np.dot(np.dot(X,R),X.T)
                beta = beta - np.dot(np.linalg.inv(H),np.dot(X,(t_hat-y).reshape(-1,1)))
            if len(loss) > 1:
                diff = loss[-1] - loss[-2]
                if diff > 0:
                    break
        self.loss = loss
        self.coef_ = beta
        return self

    def sigmoid(self,w,x):
        return 1/(1+np.exp(-np.dot(w,x)))

#Graph
fig = plt.figure(figsize=(20,10))
ax = [fig.add_subplot(3,3,i+1) for i in range(9)]

#Just consider whether virginica or not
target = 2
X = iris.data
y = iris.target
#Set y to 0 if the label is not 2 (not virginica), 1 if it is
y[np.where(np.not_equal(y,target))] = 0
y[np.where(np.equal(y,target))] = 1
scaler = preprocessing.StandardScaler()
X_fit = scaler.fit_transform(X)
X_fit = sm.add_constant(X_fit).T #Add a column of 1s for the intercept, then transpose to (5, 150)
epsilon = 10 ** (-8)
#The steepest descent method
eta_list = [0.1,0.01,0.008,0.006,0.004,0.003,0.002,0.001,0.0005]
for index,eta in enumerate(eta_list):
    myest = MyEstimator(epsilon,eta)
    myest.fit(X_fit,y,"GD")
    ax[index].plot(myest.loss)
    ax[index].set_title(f"Optimization with Gradient Descent\nStepsize = {eta}\nIterations:{len(myest.loss)}; Initial Cost is:{myest.loss[0]:.3f}; Final Cost is:{myest.loss[-1]:.6f}")
plt.tight_layout()    
plt.savefig(f"7.3GD.png ")

#Newton's method (eta is unused here; reuses the last estimator)
myest.fit(X_fit,y,"newton")
plt.clf()
plt.plot(myest.loss) 
plt.title(f"Optimization with Newton Method\nInitial Cost is:{myest.loss[0]:.3f}; Final Cost is:{myest.loss[-1]:.6f}")
plt.savefig("7.3Newton.png ")

#Results from sklearn's Logistic Regression
X_fit = scaler.fit_transform(X)
clf = LogisticRegression(penalty='none') #No regularization (use penalty=None on scikit-learn >= 1.2)
clf.fit(X_fit,y)
print(f"accuracy_score = {metrics.accuracy_score(clf.predict(X_fit),y)}")
print(f"coef = {clf.coef_} intercept = {clf.intercept_}")

Results of the steepest descent method

In the lecture, every step size down to 0.003 diverged and 0.002 gave the minimum, but my results were completely different. (Figure: 7.3GD.png)

Newton's method results

The final cost is one digit smaller than the lecture's, but the number of iterations is about the same, so it doesn't seem too far off. (Figure: 7.3Newton.png)

Results of sklearn's Logistic Regression

accuracy_score = 0.9866666666666667
coef = [[-2.03446841 -2.90222851 16.58947002 13.89172352]] intercept = [-20.10133936]

Results

The obtained parameters are as follows. For the steepest descent method, the result shown is for the step size with the smallest final cost (step size = 0.01).

\text{Steepest descent}: (w_0,w_1,w_2,w_3,w_4) = (-18.73438888,-1.97839772,-2.69938233,15.54339092,12.96694841)\\
\text{Newton's method}: (w_0,w_1,w_2,w_3,w_4) = (-20.1018028,-2.03454941,-2.90225059,16.59009858,13.89184339)\\
\text{sklearn}: (w_0,w_1,w_2,w_3,w_4) = (-20.10133936,-2.03446841,-2.90222851,16.58947002,13.89172352)

Newton's method is certainly fast, but it requires computing an inverse (or solving a linear system), so as the number of dimensions or samples grows, does it eventually lose out to stochastic gradient descent?
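Rough per-iteration costs support that intuition (my own back-of-the-envelope estimate, not from the lecture). With $N$ samples and $d$ features:

\text{Steepest descent (per iteration)}: O(Nd)\\
\text{Newton's method (per iteration)}: O(Nd^2 + d^3)

Forming $X^TRX$ costs $O(Nd^2)$ since $R$ is diagonal, and solving the $d\times d$ system costs $O(d^3)$. With stochastic gradient descent the per-update cost drops further to $O(d)$, which is why it scales to large $N$.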

Past posts

University of Tsukuba Machine Learning Course: Study and strengthen sklearn while creating the Python script part of the task (1)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (2)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (3)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (4)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (5)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (6)
University of Tsukuba Machine Learning Course: Study sklearn while making the Python script part of the task (7) Make your own steepest descent method
University of Tsukuba Machine Learning Course: Study sklearn while making the Python script part of the task (8) Make your own stochastic steepest descent method
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (9)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (10)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (11)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (12)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (13)
University of Tsukuba Machine Learning Course: Study sklearn while creating the Python script part of the assignment (14)
https://github.com/legacyworld/sklearn-basic
https://ocw.tsukuba.ac.jp/course/systeminformation/machine_learning/
