[Python] Linear regression with scikit-learn

Recap of the previous post

Last time, we read the CSV file and drew a scatter plot. The completed code and figure look like this:

import numpy as np
import matplotlib.pyplot as plt

data_set = np.loadtxt(
    fname="sampleData.csv",
    dtype="float",
    delimiter=",",
)

# Draw the scatter plot with plt.scatter
# Take the data out one row at a time and plot it
# plt.scatter(x-coordinate value, y-coordinate value)
for data in data_set:
    plt.scatter(data[0], data[1])

plt.title("correlation")
plt.xlabel("Average Temperature of SAITAMA")
plt.ylabel("Average Temperature of IWATE")
plt.grid()

plt.show()

scatter.png
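
As an aside, plt.scatter also accepts whole arrays, so the row-by-row loop can be replaced by a single call. The following is only a minimal sketch: the small data_set array is a made-up stand-in for the contents of sampleData.csv, and with a single call all points are drawn in one color.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for the contents of sampleData.csv (two columns: x, y)
data_set = np.array([
    [5.0, 1.0],
    [10.0, 6.5],
    [20.0, 15.0],
    [27.0, 23.5],
])

fig, ax = plt.subplots()
# Column 0 holds the x values, column 1 holds the y values
points = ax.scatter(data_set[:, 0], data_set[:, 1])
ax.set_title("correlation")
ax.grid()
# plt.show()  # uncomment to display the figure
```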

What to do this time

Use scikit-learn to perform a linear regression and draw the regression line.

Procedure

1. Extract the x-coordinate and y-coordinate data from the CSV data.

# Store the x and y data in separate arrays
x = np.array(1)  # Prepare a NumPy array to append to
y = np.array(1)  # At this point each array starts with a dummy value that is not needed
for data in data_set:
    x = np.append(x, data[0])  # Add the data with append
    y = np.append(y, data[1])
x = np.delete(x, 0, 0)  # Delete the unneeded dummy value at index 0
y = np.delete(y, 0, 0)

2. Pass the x and y extracted in step 1 to a model that performs linear regression.

3. Make predictions with the model created in step 2 to obtain a straight line.

4. Draw it with matplotlib.
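
Step 1 can also be written with array slicing instead of the append loop, since np.loadtxt already returns a two-dimensional array. This is a minimal sketch, assuming a small made-up array in place of sampleData.csv; it avoids the dummy first element entirely.

```python
import numpy as np

# Hypothetical stand-in for the array returned by np.loadtxt
data_set = np.array([
    [5.0, 1.0],
    [10.0, 6.5],
    [20.0, 15.0],
])

# Take every row of column 0 as x and every row of column 1 as y
x = data_set[:, 0]
y = data_set[:, 1]

print(x)
print(y)
```

Both x and y come out as one-dimensional arrays with one entry per row, the same result the append loop produces.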

What is scikit-learn

Roughly speaking, it is a module that performs regression and classification. See the official page for details.

Code using a linear regression model

# Import the module for linear regression
from sklearn.linear_model import LinearRegression

# Use NumPy's linspace to prepare 100 evenly spaced values from -10 to 40 as the x coordinates of the regression line
line_x = np.linspace(-10, 40, 100)

# Fit a least-squares model with scikit-learn to find the prediction formula
model = LinearRegression()
model = model.fit(x.reshape(-1,1), y.reshape(-1,1)) # Fit the model to the data
model_y = model.predict(line_x.reshape(-1,1)) # Predict
plt.plot(line_x, model_y, color = 'yellow')

The call model.fit(x.reshape(-1,1), y.reshape(-1,1)) reshapes the NumPy arrays to match what the function expects: fit takes the input data as a two-dimensional array of shape (n_samples, n_features), so the one-dimensional x is turned into a single column.
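
To make the effect of reshape(-1, 1) concrete, here is a small sketch with made-up values. The -1 tells NumPy to infer the number of rows, and the 1 fixes the number of columns.

```python
import numpy as np

x = np.array([5.0, 10.0, 20.0])  # hypothetical values

print(x.shape)  # (3,)

# -1 lets NumPy work out the row count; 1 means a single column
X = x.reshape(-1, 1)
print(X.shape)  # (3, 1)
```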

Here is the completed code and figure:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

data_set = np.loadtxt(
    fname="sampleData.csv",
    dtype="float",
    delimiter=",",
)
# Store the x and y data in separate arrays
x = np.array(1)
y = np.array(1)
for data in data_set:
    x = np.append(x, data[0])
    y = np.append(y, data[1])
x = np.delete(x, 0,0)
y = np.delete(y, 0,0)


#Draw a scatter plot
for data in data_set:
    plt.scatter(data[0], data[1])


# Fit a least-squares model with scikit-learn to find the prediction formula
model = LinearRegression()
model = model.fit(x.reshape(-1,1), y.reshape(-1,1))
line_x = np.linspace(-10, 40, 100)
model_y = model.predict(line_x.reshape(-1,1))
plt.plot(line_x, model_y, color = 'yellow')

plt.title("correlation")
plt.xlabel("Average Temperature of SAITAMA")
plt.ylabel("Average Temperature of IWATE")
plt.grid()

plt.show()

lineReg_scikit.png
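
The fitted line has the form y = slope * x + intercept, and a fitted LinearRegression exposes both values as the attributes coef_ and intercept_. The sketch below uses made-up data that lies exactly on y = 2x + 1 rather than the temperature data. It also passes y as a one-dimensional array, which is an assumption that differs from the code above; in that case coef_ has one entry and intercept_ is a scalar.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data lying exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

model = LinearRegression()
model.fit(x.reshape(-1, 1), y)  # X must be 2-D; y may stay 1-D

slope = model.coef_[0]        # one coefficient per feature
intercept = model.intercept_  # scalar when y is 1-D
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

On this data the fit should recover a slope of about 2 and an intercept of about 1. With y reshaped to a column as in the article's code, coef_ and intercept_ carry an extra dimension, so they would be indexed as model.coef_[0][0] and model.intercept_[0].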

That's all. Thank you for reading.
