Last time I got absorbed in the mathematical discussion and lost sight of what I was writing for. This time, we will implement multiple regression in Python.
Writing it yourself is essential for getting used to Python syntax and NumPy. Referring back to the implementation in Part 1 should also deepen your understanding.
The weights are given by $\beta = (X^T X)^{-1} X^T y$, the solution of the normal equation $X^T y = X^T X \beta$. We do not solve $y = X\beta$ directly because $X$ is in general not a square matrix, so it may not have an inverse. If $X^T X$ is singular, the inversion will raise an error, but for now let's just compute $\beta = (X^T X)^{-1} X^T y$. The goal is a model class that can be used like this:
model = LinearRegression()
model.fit(X_train, y)
model.predict(X_test)
Let's build the skeleton. Computing matrix-matrix and matrix-vector products is very easy with NumPy's np.dot(A, B).
import numpy as np
import matplotlib.pyplot as plt
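As a tiny warm-up (this snippet is my addition, not part of the original post), here is what np.dot does for a matrix-vector and a matrix-matrix product:

A = np.array([[1, 2], [3, 4]])  # 2 x 2 matrix
v = np.array([5, 6])            # length-2 vector
print(np.dot(A, v))    # matrix-vector product -> [17 39]
print(np.dot(A.T, A))  # matrix-matrix product -> [[10 14] [14 20]]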
Now let's create the familiar regression model as a class. I had a hard time here because I wasn't used to Python syntax. For simplicity, the bias is stored as the first element of the weight vector; correspondingly, a column of ones is prepended to the explanatory variables.
class LinearRegression():
    def fit(self, X, y):
        # Prepend a column of ones so that the bias becomes beta[0]
        X_ = np.concatenate([np.ones(X.shape[0]).reshape(-1, 1), X], axis=1)
        # Equivalent: X_ = np.c_[np.ones(X.shape[0]), X]
        self.beta = np.linalg.inv(np.dot(X_.T, X_)).dot(X_.T).dot(y)

    def predict(self, X):
        X_ = np.concatenate([np.ones(X.shape[0]).reshape(-1, 1), X], axis=1)
        # Equivalent: X_ = np.c_[np.ones(X.shape[0]), X]
        return np.dot(X_, self.beta)
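As noted above, np.linalg.inv raises an error when $X^T X$ is singular. Below is a minimal sketch of one more robust variant (my addition, not the author's implementation; the class name LinearRegressionPinv is made up for illustration) that uses the Moore-Penrose pseudo-inverse instead of an explicit inverse:

class LinearRegressionPinv():
    def fit(self, X, y):
        # Prepend the constant column for the bias, as before
        X_ = np.c_[np.ones(X.shape[0]), X]
        # Least-squares (minimum-norm) solution; works even when X^T X is singular
        self.beta = np.linalg.pinv(X_).dot(y)

    def predict(self, X):
        X_ = np.c_[np.ones(X.shape[0]), X]
        return np.dot(X_, self.beta)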
As an example, we use samples drawn from a distribution close to $y = 2x_1 + 3x_2 + 4$.
n = 100  # prepare 100 samples
np.random.seed(0)
X_train = (np.random.random((n, 2)) - 0.5) * 20  # n x 2 matrix of uniform random numbers in the range -10 to 10
y = 2*X_train[:, 0] + 3*X_train[:, 1] + 4 + np.random.randn(n) * 0.1  # targets with a little Gaussian noise
model = LinearRegression()
model.fit(X_train, y)
print("beta={}".format(model.beta))
Running this prints

beta=[3.9916141 1.9978685 3.00014788]

Here beta[0] is the bias. You can see that the estimates are a good approximation of the true coefficients.
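As a quick sanity check (my addition, not in the original post; X_check is a made-up name), we can call predict on a couple of hand-picked points and compare with the true function $y = 2x_1 + 3x_2 + 4$:

X_check = np.array([[0.0, 0.0], [1.0, 2.0]])  # true values: 4 and 2*1 + 3*2 + 4 = 12
print(model.predict(X_check))                 # should print values close to [4, 12]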
Let's visualize the result, starting with a 3-D scatter plot of the training data.
from mpl_toolkits.mplot3d import Axes3D  # registers the "3d" projection (needed on older Matplotlib)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")  # Axes3D(fig) alone no longer attaches the axes on recent Matplotlib
ax.plot(X_train[:, 0], X_train[:, 1], y, marker="o", linestyle="None")
With the scatter plot in place, let's overlay the fitted plane.
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot(X_train[:, 0], X_train[:, 1], y, marker="o", linestyle="None")
xmesh, ymesh = np.meshgrid(np.linspace(-10, 10, 20), np.linspace(-10, 10, 20))  # 20 x 20 grid over [-10, 10]^2
zmesh = (model.beta[1] * xmesh.ravel() + model.beta[2] * ymesh.ravel() + model.beta[0]).reshape(xmesh.shape)  # fitted plane evaluated on the grid
ax.plot_wireframe(xmesh, ymesh, zmesh, color="r")
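Depending on your environment the figure may not appear until you call plt.show(); adding axis labels also helps readability (this snippet is my addition):

ax.set_xlabel("x1")
ax.set_ylabel("x2")
ax.set_zlabel("y")
plt.show()  # display the scatter plot together with the fitted plane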
That was the hand-rolled version; in practice you would use scikit-learn. Let's implement it in one go!
from sklearn.linear_model import LinearRegression
n = 100  # prepare 100 samples
np.random.seed(0)
X_train = (np.random.random((n, 2)) - 0.5) * 20  # n x 2 matrix of uniform random numbers in the range -10 to 10
y = 2*X_train[:, 0] + 3*X_train[:, 1] + 4 + np.random.randn(n) * 0.1
model = LinearRegression()
model.fit(X_train, y)
print("coef_, intercept_ = {}, {}".format(model.coef_, model.intercept_))
X_test = np.c_[np.linspace(-10, 10, 20), np.linspace(-10, 10, 20)]
y_test_pred = model.predict(X_test)  # keep the test predictions for the residual plot below
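# Optional check (my addition, not in the original post): model.score returns the
# coefficient of determination R^2 on the given data; it should be very close to 1 here.
print("R^2 = {}".format(model.score(X_train, y)))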
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot(X_train[:, 0], X_train[:, 1], y, marker="o", linestyle="None")
xmesh, ymesh = np.meshgrid(X_test[:, 0], X_test[:, 1])
zmesh = (model.coef_[0] * xmesh.ravel() + model.coef_[1] * ymesh.ravel() + model.intercept_).reshape(xmesh.shape)
ax.plot_wireframe(xmesh, ymesh, zmesh, color="r")
Did you get the same figure?
Let's plot the residuals.
# Residual plot
y_train_pred = model.predict(X_train)
plt.scatter(y_train_pred, y_train_pred - y, marker="o", label="training data")
plt.scatter(y_test_pred, y_test_pred - (2*X_test[:, 0]+3*X_test[:, 1]+4), marker="^", label="test data")
plt.xlabel("predicted values")
plt.ylabel("residuals")
plt.hlines(y=0, xmin=-40, xmax=55, color="k")
plt.xlim([-40, 55])
plt.legend()  # show the "training data" / "test data" labels
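To quantify the fit beyond the residual plot (my addition, not in the original post), you can compute the mean squared error and $R^2$ with sklearn.metrics:

from sklearn.metrics import mean_squared_error, r2_score

print("train MSE = {}".format(mean_squared_error(y, y_train_pred)))
print("train R^2 = {}".format(r2_score(y, y_train_pred)))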
There is much more to learn about multiple regression, but this should be enough for practical work. Next time, we plan to turn to Ridge and Lasso.