I implemented Coursera's logistic regression in Python

I implemented the logistic regression exercise from Week 3 of Dr. Andrew Ng's Machine Learning course on Coursera. I tried to use only NumPy as much as possible.

First, the customary notebook magic command and imports.

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

Read the data and plot it.

def plotData(data):
    # The third column holds the class label (0 or 1).
    neg = data[:,2] == 0
    pos = data[:,2] == 1

    # Label each series so that plt.legend() has entries to show.
    plt.scatter(data[pos][:,0], data[pos][:,1], marker='+', c='k', s=60, linewidth=2, label='y = 1')
    plt.scatter(data[neg][:,0], data[neg][:,1], c='y', s=60, label='y = 0')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.legend(frameon=True, fancybox=True)
    plt.show()


data = np.loadtxt('ex2data1.txt', delimiter=',')
plotData(data)

https://diveintocode.gyazo.com/80f1b66e71c83968be637cc4fffb119b

Next, implement the sigmoid function and the cost function.

h_θ(x) = g(θ^T x)
g(z) = \frac{1}{1+e^{-z}}

Sigmoid function

J(θ) = -\frac{1}{m}\left(\log(g(Xθ))^T y + \log(1-g(Xθ))^T (1-y)\right)

Cost function

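def sigmoid(z):
    # Logistic function g(z) = 1 / (1 + e^(-z)), applied element-wise.
    return(1 / (1 + np.exp(-z)))

def CostFunction(theta, X, y):
    m = len(y)
    h = sigmoid(X.dot(theta))  # predicted probabilities

    # Cross-entropy cost; the leading minus sign makes it non-negative.
    j = -1*(1/m)*(np.log(h).T.dot(y)+np.log(1-h).T.dot(1-y))

    # j is a one-element array; return it as a plain float.
    return j.item()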
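One caveat with this cost function: when sigmoid(X.dot(theta)) saturates to exactly 0 or 1 in floating point, np.log returns -inf and the cost can come out as inf or nan. A minimal sketch of a more robust variant, assuming the same theta, X, y layout, uses the identities log g(z) = -log(1 + e^{-z}) and log(1 - g(z)) = -log(1 + e^{z}), evaluated with np.logaddexp. The name stable_cost and the demo arrays are hypothetical and not part of the original code.

```python
import numpy as np

def stable_cost(theta, X, y):
    """Cross-entropy cost computed without ever forming log(0).

    Hypothetical helper; assumes y holds 0/1 labels and X has a
    leading column of ones, as in the article.
    """
    z = X.dot(theta).ravel()
    y = np.asarray(y).ravel()
    # log(g(z)) = -log(1 + exp(-z)) and log(1 - g(z)) = -log(1 + exp(z)).
    log_h = -np.logaddexp(0.0, -z)
    log_1mh = -np.logaddexp(0.0, z)
    return float(-np.mean(y * log_h + (1 - y) * log_1mh))

# Small synthetic check (made-up numbers, not ex2data1.txt).
X_demo = np.array([[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]])
y_demo = np.array([1.0, 0.0, 1.0])
cost_zero = stable_cost(np.zeros(2), X_demo, y_demo)
cost_large = stable_cost(np.array([0.0, 1000.0]), X_demo, y_demo)
print(cost_zero, cost_large)
```

With very large parameter values, as in the second call, the cost stays finite, whereas the direct formula would hit log(0) for the same inputs.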
    
    

Let's compute the cost at the initial parameters (all zeros).

X = np.c_[np.ones((data.shape[0],1)), data[:,0:2]]
y = np.c_[data[:,2]]

initial_theta = np.zeros(X.shape[1])
cost = CostFunction(initial_theta, X, y)
print(cost)

https://diveintocode.gyazo.com/b77db964bc493df3d34773a0de4d2f29
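A quick sanity check on this value: with theta set to all zeros, sigmoid(0) = 0.5 for every example, so the cost equals -log(0.5) = log(2) ≈ 0.693 regardless of the data. The sketch below illustrates this on made-up data; the helper name cost and the demo arrays are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(theta, X, y):
    # Same cross-entropy formula as in the article, returned as a float.
    h = sigmoid(X.dot(theta))
    return float(-np.mean(y * np.log(h) + (1 - y) * np.log(1 - h)))

rng = np.random.default_rng(0)
X_demo = np.c_[np.ones(20), rng.normal(size=(20, 2))]  # bias column + 2 features
y_demo = rng.integers(0, 2, size=20).astype(float)      # random 0/1 labels

initial_cost = cost(np.zeros(3), X_demo, y_demo)
print(initial_cost, np.log(2))
```

If the initial cost on the real data differs noticeably from log(2), the cost function or the construction of X and y is worth rechecking.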

With the initial cost computed, the next step is to implement the steepest descent (gradient descent) method.
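For reference, writing the cost as J(θ), its gradient and the resulting update for a learning rate α (alpha in the code) can be expressed in the same vectorized notation as the cost function:

```latex
\nabla_θ J(θ) = \frac{1}{m} X^T \left(g(Xθ) - y\right)
θ := θ - α \nabla_θ J(θ) = θ - \frac{α}{m} X^T \left(g(Xθ) - y\right)
```

Each iteration applies this update once, using all m training examples.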

def gradient_decent (theta, X, y, alpha = 0.001, num_iters = 100000):
    m = len(y)
    history = np.zeros(num_iters)  # cost after each update

    for inter in np.arange(num_iters):
        h = sigmoid(X.dot(theta))
        # Batch update: step against the gradient (1/m) * X^T (h - y).
        theta = theta - alpha *(1/m)*(X.T.dot(h-y))
        history[inter] = CostFunction(theta,X,y)

    return(theta, history)
    
initial_theta = np.zeros(X.shape[1])
theta = initial_theta.reshape(-1,1)
cost = CostFunction(initial_theta,X,y)
theta, Cost_h= gradient_decent(theta, X, y)

print(theta)
plt.plot(Cost_h)
plt.ylabel('Cost_h')
plt.xlabel('iteration')
plt.show()

The result of executing it is as follows.

https://diveintocode.gyazo.com/e177f7b75ceffe2d461a017df96d856c
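The update step relies on X.T.dot(h - y) / m being the gradient of the cost. One way to gain confidence in that expression is to compare it with a finite-difference approximation. The following is a minimal sketch on made-up data; the function names and the step size eps are assumptions, not part of the original code.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(theta, X, y):
    h = sigmoid(X.dot(theta))
    return float(-np.mean(y * np.log(h) + (1 - y) * np.log(1 - h)))

def analytic_gradient(theta, X, y):
    # The expression used in the update step: (1/m) * X^T (h - y).
    return X.T.dot(sigmoid(X.dot(theta)) - y) / len(y)

def numerical_gradient(theta, X, y, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = np.zeros_like(theta)
    for k in range(len(theta)):
        step = np.zeros_like(theta)
        step[k] = eps
        grad[k] = (cost(theta + step, X, y) - cost(theta - step, X, y)) / (2 * eps)
    return grad

rng = np.random.default_rng(1)
X_demo = np.c_[np.ones(30), rng.normal(size=(30, 2))]
y_demo = rng.integers(0, 2, size=30).astype(float)
theta_demo = np.array([0.1, -0.2, 0.3])

max_diff = np.max(np.abs(analytic_gradient(theta_demo, X_demo, y_demo)
                         - numerical_gradient(theta_demo, X_demo, y_demo)))
print(max_diff)
```

A very small max_diff suggests the analytic gradient and the cost function agree with each other.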

For some reason, this run (num_iters = 100000) gives better results than setting num_iters to 10000000. Let's check the accuracy.

def predict(theta, X, threshold = 0.5):
    p = sigmoid(X.dot(theta)) >= threshold
    
    return(p.astype('int'))

p = predict(theta,X)
y = y.astype('int')

# Percentage of examples where the prediction matches the label.
print(np.mean(p == y) * 100)
    

Running the above code gives an accuracy of 91.0%. So what happens if you run it with num_iters set to 1000000?

https://diveintocode.gyazo.com/0ee6bb360d9c5d4960db54189f677fa3

Checking the accuracy for this run gives 89.0%. It is unclear to me why the accuracy is worse here even though the cost is lower.
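One point that may be relevant: gradient descent minimizes the cross-entropy cost, while accuracy only counts which side of the 0.5 threshold each prediction falls on, so a lower cost does not by itself guarantee a higher accuracy. The toy example below uses made-up probabilities (not the ex2data1.txt results) to show a case where the model with the lower cost also has the lower accuracy. Whether this is what happens in the runs above is an assumption that would need checking.

```python
import numpy as np

def log_loss(p, y):
    # Mean cross-entropy for predicted probabilities p and 0/1 labels y.
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def accuracy(p, y, threshold=0.5):
    # Percentage of thresholded predictions that match the labels.
    return float(np.mean((p >= threshold).astype(int) == y) * 100)

y_true = np.array([1, 1, 0, 0])

# Model A: every prediction is on the correct side, but only barely.
p_a = np.array([0.51, 0.51, 0.49, 0.49])
# Model B: three confident correct predictions and one near miss.
p_b = np.array([0.99, 0.99, 0.01, 0.55])

loss_a, acc_a = log_loss(p_a, y_true), accuracy(p_a, y_true)
loss_b, acc_b = log_loss(p_b, y_true), accuracy(p_b, y_true)
print(loss_a, acc_a)
print(loss_b, acc_b)
```

Model B has the lower cost because its correct predictions are very confident, yet it misclassifies one example that Model A gets right.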

Reference

https://github.com/JWarmenhoven/Coursera-Machine-Learning
