[PYTHON] Let's try again: Maximum likelihood estimation and fitting of a model (probability distribution) ② Continuous probability distributions

Introduction

Previous post: maximum likelihood estimation and model fitting for discrete probability distributions. This time, we apply maximum likelihood estimation and model (probability distribution) fitting to a continuous probability distribution.

Continuous probability distribution: normal distribution

The probability density function of the normal distribution is as follows: $P(y) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(y-\mu)^2}{2\sigma^2}} \quad (\mu, \sigma = \text{parameters})$
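For $n$ independent observations $y_1, \dots, y_n$, the log-likelihood to be maximized below is (writing $v = \sigma^2$, as the code does):

$$\log L(\mu, v) = \sum_{i=1}^{n} \log P(y_i) = -\frac{n}{2}\log(2\pi v) - \frac{1}{2v}\sum_{i=1}^{n}(y_i - \mu)^2$$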

As before, the test data used for estimation and fitting are generated in advance from the target probability distribution, so that the estimates can be checked against the true parameter values.

Python


"""Random numbers from the normal distribution"""
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(seed=10)
norm_values = np.random.normal(0, 1, size=1000) #Average 0,1000 extracts from normal distribution with standard deviation 1
#drawing
plt.hist(norm_values, bins=50, range=[-5, 5], normed=True)
bin_edges

[Figure: histogram of the 1000 sampled values] Since the variable is continuous, it cannot be counted value by value as in the discrete case. Instead, the range [-5, 5] is divided into 50 bins of width 0.2 and the values are counted per bin.

Now, let's apply a normal distribution to such data and estimate its parameters with maximum likelihood.

Python


"""Parameter estimation by partial differentiation of the log-likelihood function"""
import sympy
#Define variables (v=σ**2)
sympy.var('μ v y')
#Likelihood p(Parameters|x)Define
fe=(1/sympy.sqrt(2*sympy.pi*v))*sympy.exp(-(y-μ)**2/(2*v))
#Logarithmic
logf=sympy.log(fe)
#Partially differentiate f and expand the equation
pdff1 = sympy.expand(sympy.diff(logf, μ)) #Partial differential with respect to μ
pdff2 = sympy.expand(sympy.diff(logf, v)) #Partial differential with respect to v

The expression pdff1, the partial derivative of the log-likelihood with respect to μ, is $\frac{y}{v} - \frac{\mu}{v}$, and the expression pdff2, the partial derivative with respect to v, is $-\frac{1}{2v} + \frac{y^2}{2v^2} - \frac{y\mu}{v^2} + \frac{\mu^2}{2v^2}$. Setting $\sum_i \text{pdff1} = 0$ and $\sum_i \text{pdff2} = 0$ over all observations and solving gives μ and v.
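Solving $\sum_i \text{pdff1} = 0$ by hand gives the sample mean, and substituting it into $\sum_i \text{pdff2} = 0$ gives the uncorrected sample variance:

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \hat{v} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{\mu})^2$$

The code below solves the same two equations numerically with sympy.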

Python


def L_sympy(fmu, fs, var, values):
    likelihood_mu = 0  # accumulator for the μ equation
    likelihood_s = 0   # accumulator for the v equation
    for i in np.arange(len(values)):
        # build the score equations by summing over the data
        likelihood_mu += fmu.subs(var, values[i])  # substitute the data value for y
        likelihood_s += fs.subs(var, values[i])    # substitute the data value for y
    param = sympy.solve([likelihood_mu, likelihood_s])  # solve the pair of equations
    return param

parameters = L_sympy(pdff1, pdff2, "y", norm_values)
parameters[0]["σ"] = sympy.sqrt(parameters[0][v])  # convert the variance v to σ
parameters

[{v: 0.879764754284410, μ: -0.0145566356154705, 'σ': 0.937957757196138}]

These are very close to the true values μ = 0 and σ = 1 used to generate the data.
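As a quick sanity check (a sketch using numpy directly, not part of the original derivation): the closed-form estimates above are just the sample mean and the uncorrected standard deviation, so sympy is not strictly needed here.

Python


# Closed-form MLE for the normal distribution: the sample mean and the
# uncorrected (ddof=0) standard deviation; these should match the sympy result
print(norm_values.mean())       # ≈ -0.0146
print(norm_values.std(ddof=0))  # ≈ 0.938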

When the statistical model becomes complicated

For details, see the previous post ①. When the model is too complicated to solve analytically, we use numerical methods for solving the nonlinear equations instead.

Python


"""Probability density function"""
def probability_function(x,param):
    from scipy.special import factorial
    # param[0]s,param[1]mu:Parameters
    # y:Data y
    # return:Probability density P of data y(y|Parameters) 
    return (1/np.sqrt(2*np.pi*param[1]**2))*np.exp(-0.5*(y-param[0])**2/param[1]**2)

"""Log-likelihood function (continuous)"""
def L_func_c(param,y):       
    likelihood = 0 #Initial value of likelihood
    for i in np.arange(len(y)):
        # model output 
        p = probability_function(y[i], param)
        #likelihood likelihood
        likelihood += -np.log(p) #Likelihood
    return likelihood

"""Parameter estimation"""
"""
Quasi-Newton method (BFGS, L)-BFGS method: Memory can be saved in complicated cases)
"""
from scipy import optimize
x0 = [0,0.1] #Initial values of parameters
bound = [(-100, 100),(0, None)]#,(0,None)] #Scope of optimal parameter search(min,max)

params_MLE = optimize.minimize(L_func_c,x0,args=(norm_values),method='l-bfgs-b',
                        jac=None, bounds=bound, tol=None, callback=None,
                        options={'disp': None, 'maxls': 20, 'iprint': -1,
                                 'gtol': 1e-05, 'eps': 1e-08, 'maxiter': 15000,
                                 'ftol': 2.220446049250313e-09, 'maxcor': 10,
                                 'maxfun': 15000})
# Maximum likelihood parameter estimates
print('Parameter μ,σ:', params_MLE.x)
# Number of free parameters (μ and σ)
k = 2
# AIC = -2*(maximum log-likelihood) + 2k (the model with the smallest value fits best)
print('AIC:', params_MLE.fun*(2)+2*k)

Parameter μ,σ: [-0.01455655  0.93795781]
AIC: 2713.7763344850605
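A side note on the implementation: the Python loop in L_func_c calls probability_function once per observation, which is slow for large samples. Since probability_function is built from vectorized numpy operations, a minimal sketch of an equivalent loop-free version (a hypothetical drop-in replacement, not from the original post) looks like this:

Python


"""Vectorized negative log-likelihood (equivalent to L_func_c, without the Python loop)"""
def L_func_c_vec(param, y):
    # evaluate the density for the whole sample at once and sum the -log values
    return -np.log(probability_function(y, param)).sum()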

If you just want to know the parameters quickly ...

For details, see the previous post ①. As there, scipy.stats provides ready-made parameter estimation.

Python


"""scipy.stats.There is also a parameter estimation with fit"""
from scipy.stats import norm
fit_parameter = norm.fit(norm_values)
fit_parameter

#Parameter μ,σ
(-0.014556635615470447, 0.9379577571961389)
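The fitted (loc, scale) pair agrees with the optimizer's μ and σ above. As a minimal cross-check sketch (not in the original post), the same negative log-likelihood, and hence the same AIC with k = 2, can be recomputed from these parameters with scipy.stats:

Python


# Negative log-likelihood at the fitted parameters; should reproduce params_MLE.fun
nll = -norm.logpdf(norm_values, *fit_parameter).sum()
print('AIC:', 2*nll + 2*2)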

Apply the model to the data and plot it

Let's apply the model (normal distribution) to the data with the estimated parameters μ = -0.01455655, σ = 0.93795781.

Python


acc_mle = probability_function(np.sort(norm_values), params_MLE.x)
plt.hist(norm_values, bins=50, range=[-5, 5], density=True)
plt.plot(np.sort(norm_values), acc_mle)

[Figure: histogram with the fitted normal density curve overlaid]

The fit looks good.
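As a variation (a sketch, not from the original post), the fitted density can also be drawn on an evenly spaced grid instead of the sorted data points, which keeps the curve smooth even where the sample is sparse:

Python


# Plot the fitted density on an evenly spaced grid instead of the sorted sample
xs = np.linspace(-5, 5, 200)
plt.hist(norm_values, bins=50, range=[-5, 5], density=True)
plt.plot(xs, probability_function(xs, params_MLE.x))
plt.show()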

Recommended Posts

Let's try again Maximum likelihood estimation and fitting of model (probability distribution) ② Continuous probability distribution
Advantages and disadvantages of maximum likelihood estimation
Least squares method and maximum likelihood estimation method (comparison by model fitting)
Example of python code for exponential distribution and maximum likelihood estimation (MLE)
Concept of Bayesian reasoning (2) ... Bayesian estimation and probability distribution
Maximum likelihood estimation implementation of topic model in python
Maximum likelihood estimation of mean and variance with TensorFlow
Machine Learning Super Introduction Probability Model and Maximum Likelihood Estimate
Maximum likelihood estimation of various distributions with Pyro