Last time I got absorbed in the mathematical discussion and lost sight of what I was writing for. This time, we will implement multiple regression in Python.
Writing it yourself is essential for getting used to Python syntax and NumPy. Referring back to the implementation in Part 1 should also deepen your understanding.
The weights are given by $\beta = (X^T X)^{-1} X^T y$, the solution of the normal equation $X^T y = X^T X \beta$. We do not solve $y = X\beta$ directly because $X$ is in general not a square matrix, so it may not have an inverse. If $X^T X$ is singular, the inversion will raise an error, but for now let's just compute $\beta = (X^T X)^{-1} X^T y$. The goal is a model class that can be used like this:
model = LinearRegression()
model.fit(X_train, y)
model.predict(X_test)
Let's build the skeleton. Computing matrix-matrix and matrix-vector products is very easy with NumPy's np.dot(A, B).
import numpy as np
import matplotlib.pyplot as plt
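As a tiny warm-up (this snippet is my addition, not part of the original post), here is what np.dot does for a matrix-vector and a matrix-matrix product:

A = np.array([[1, 2], [3, 4]])  # 2 x 2 matrix
v = np.array([5, 6])            # length-2 vector
print(np.dot(A, v))    # matrix-vector product -> [17 39]
print(np.dot(A.T, A))  # matrix-matrix product -> [[10 14] [14 20]]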
Now let's create the familiar regression model as a class. I had a hard time here because I wasn't used to Python syntax. For simplicity, the bias is stored as the first element of the weight vector; correspondingly, a column of ones is prepended to the explanatory variables.
class LinearRegression():
    def fit(self, X, y):
        # Prepend a column of ones so that the bias becomes beta[0]
        X_ = np.concatenate([np.ones(X.shape[0]).reshape(-1, 1), X], axis=1)
        # Equivalent: X_ = np.c_[np.ones(X.shape[0]), X]
        self.beta = np.linalg.inv(np.dot(X_.T, X_)).dot(X_.T).dot(y)

    def predict(self, X):
        X_ = np.concatenate([np.ones(X.shape[0]).reshape(-1, 1), X], axis=1)
        # Equivalent: X_ = np.c_[np.ones(X.shape[0]), X]
        return np.dot(X_, self.beta)
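As noted above, np.linalg.inv raises an error when $X^T X$ is singular. Below is a minimal sketch of one more robust variant (my addition, not the author's implementation; the class name LinearRegressionPinv is made up for illustration) that uses the Moore-Penrose pseudo-inverse instead of an explicit inverse:

class LinearRegressionPinv():
    def fit(self, X, y):
        # Prepend the constant column for the bias, as before
        X_ = np.c_[np.ones(X.shape[0]), X]
        # Least-squares (minimum-norm) solution; works even when X^T X is singular
        self.beta = np.linalg.pinv(X_).dot(y)

    def predict(self, X):
        X_ = np.c_[np.ones(X.shape[0]), X]
        return np.dot(X_, self.beta)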
As an example, we use samples drawn from a distribution close to $y = 2x_1 + 3x_2 + 4$.
n = 100  # prepare 100 samples
np.random.seed(0)
X_train = (np.random.random((n, 2)) - 0.5) * 20  # n x 2 matrix of uniform random numbers in the range -10 to 10
y = 2*X_train[:, 0] + 3*X_train[:, 1] + 4 + np.random.randn(n) * 0.1  # targets with a little Gaussian noise
model = LinearRegression()
model.fit(X_train, y)
print("beta={}".format(model.beta))
Running this prints

beta=[3.9916141 1.9978685 3.00014788]

Here beta[0] is the bias. You can see that the estimates are a good approximation of the true coefficients.
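As a quick sanity check (my addition, not in the original post; X_check is a made-up name), we can call predict on a couple of hand-picked points and compare with the true function $y = 2x_1 + 3x_2 + 4$:

X_check = np.array([[0.0, 0.0], [1.0, 2.0]])  # true values: 4 and 2*1 + 3*2 + 4 = 12
print(model.predict(X_check))                 # should print values close to [4, 12]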
Let's visualize the result, starting with a 3-D scatter plot of the training data.
from mpl_toolkits.mplot3d import Axes3D  # registers the "3d" projection (needed on older Matplotlib)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")  # Axes3D(fig) alone no longer attaches the axes on recent Matplotlib
ax.plot(X_train[:, 0], X_train[:, 1], y, marker="o", linestyle="None")
With the scatter plot in place, let's overlay the fitted plane.
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot(X_train[:, 0], X_train[:, 1], y, marker="o", linestyle="None")
xmesh, ymesh = np.meshgrid(np.linspace(-10, 10, 20), np.linspace(-10, 10, 20))  # 20 x 20 grid over [-10, 10]^2
zmesh = (model.beta[1] * xmesh.ravel() + model.beta[2] * ymesh.ravel() + model.beta[0]).reshape(xmesh.shape)  # fitted plane evaluated on the grid
ax.plot_wireframe(xmesh, ymesh, zmesh, color="r")
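Depending on your environment the figure may not appear until you call plt.show(); adding axis labels also helps readability (this snippet is my addition):

ax.set_xlabel("x1")
ax.set_ylabel("x2")
ax.set_zlabel("y")
plt.show()  # display the scatter plot together with the fitted plane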
That was the hand-rolled version; in practice you would use scikit-learn. Let's implement it in one go!
from sklearn.linear_model import LinearRegression
n = 100  # prepare 100 samples
np.random.seed(0)
X_train = (np.random.random((n, 2)) - 0.5) * 20  # n x 2 matrix of uniform random numbers in the range -10 to 10
y = 2*X_train[:, 0] + 3*X_train[:, 1] + 4 + np.random.randn(n) * 0.1
model = LinearRegression()
model.fit(X_train, y)
print("coef_, intercept_ = {}, {}".format(model.coef_, model.intercept_))
X_test = np.c_[np.linspace(-10, 10, 20), np.linspace(-10, 10, 20)]
y_test_pred = model.predict(X_test)  # keep the test predictions for the residual plot below
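# Optional check (my addition, not in the original post): model.score returns the
# coefficient of determination R^2 on the given data; it should be very close to 1 here.
print("R^2 = {}".format(model.score(X_train, y)))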
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot(X_train[:, 0], X_train[:, 1], y, marker="o", linestyle="None")
xmesh, ymesh = np.meshgrid(X_test[:, 0], X_test[:, 1])
zmesh = (model.coef_[0] * xmesh.ravel() + model.coef_[1] * ymesh.ravel() + model.intercept_).reshape(xmesh.shape)
ax.plot_wireframe(xmesh, ymesh, zmesh, color="r")
Did you get the same figure?
Let's plot the residuals.
# Residual plot
y_train_pred = model.predict(X_train)
plt.scatter(y_train_pred, y_train_pred - y, marker="o", label="training data")
plt.scatter(y_test_pred, y_test_pred - (2*X_test[:, 0]+3*X_test[:, 1]+4), marker="^", label="test data")
plt.xlabel("predicted values")
plt.ylabel("residuals")
plt.hlines(y=0, xmin=-40, xmax=55, color="k")
plt.xlim([-40, 55])
plt.legend()  # show the "training data" / "test data" labels
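To quantify the fit beyond the residual plot (my addition, not in the original post), you can compute the mean squared error and $R^2$ with sklearn.metrics:

from sklearn.metrics import mean_squared_error, r2_score

print("train MSE = {}".format(mean_squared_error(y, y_train_pred)))
print("train R^2 = {}".format(r2_score(y, y_train_pred)))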
There is much more to learn about multiple regression, but this should be enough for practical work. Next time, we plan to turn to Ridge and Lasso.