"Introduction to Machine Learning by Bayesian Inference" Approximate inference of Poisson mixed model implemented only with Python numpy

Last month I read through ["Introduction to Machine Learning by Bayesian Inference"](https://www.amazon.co.jp/dp/4061538322) and implemented it in Python.

The content is Chapter 4, "4.3 Inference in the Poisson Mixture Model". It's the green book with the flower-patterned cover. And this is just the "introduction"... machine learning runs deep. Lol

Data creation

This time, we generate multimodal one-dimensional data by combining two Poisson distributions into a bimodal distribution.

import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline

plt.figure(figsize=(12, 5))
plt.title('Poisson distribution', fontsize=20)

# Create sample data
Lambda_x1 = 30
Lambda_x2 = 50
samples_x1 = 300
samples_x2 = 200
x1 = np.random.poisson(Lambda_x1, samples_x1)
x2 = np.random.poisson(Lambda_x2, samples_x2)

# Combine the data
X = np.concatenate([x1, x2])

plt.hist(X, 30, density=True, label='all data')
plt.legend()
plt.show()

The result is as follows.

It is a complete coincidence, but this histogram does not look obviously bimodal at first glance, so it makes a perfect showcase for the effectiveness of Bayesian inference!

Create a logsumexp function

First, as preparation, let's create a function that computes logsumexp.

… And here you might be thinking, "logsumexp doesn't appear in the book, does it?" That's right. It doesn't. Lol

However, this "logsumexp" plays an important role in this algorithm!

First, this function is used in equation (4.38) of the book:

$$\eta_{nk} \propto \exp\left\{ x_n \ln \lambda_k - \lambda_k + \ln \pi_k \right\}, \qquad \sum_{k=1}^{K} \eta_{nk} = 1$$

Because η must also satisfy the constraint in the second expression, logsumexp is required.

In fact, if you compute η directly from the first expression, it does not satisfy the constraint, so it needs to be "normalized". This normalization is the operation of "making the values sum to 1", so each value must be "divided by the sum". The expression above is therefore transformed as follows:

$$
\begin{aligned}
\eta_{nk} &= \frac{\exp\left\{ x_n \ln \lambda_k - \lambda_k + \ln \pi_k \right\}}{\sum_{k'=1}^{K} \exp\left\{ x_n \ln \lambda_{k'} - \lambda_{k'} + \ln \pi_{k'} \right\}} \\
&= \exp\left\{ x_n \ln \lambda_k - \lambda_k + \ln \pi_k - \ln \sum_{k'=1}^{K} \exp\left( x_n \ln \lambda_{k'} - \lambda_{k'} + \ln \pi_{k'} \right) \right\}
\end{aligned}
$$

The first line follows from the identities exp(ln x) = x and 1/x = exp(−ln x).

The last line is then easy to obtain using the law of exponents, exp(a) exp(b) = exp(a + b).

As a result, a normalization term has been added at the end of the last line. If you look closely at this term, it contains "log", "sum [= Σ]", and "exp [= the unnormalized η]"!
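As a tiny numerical illustration of this identity (my own example, not from the book), dividing by the sum and subtracting the log of the sum inside exp give the same result:

log_eta_demo = np.array([0.5, 1.5, -0.3])                            # unnormalized log η for one data point
direct = np.exp(log_eta_demo) / np.exp(log_eta_demo).sum()           # divide by the sum directly
via_lse = np.exp(log_eta_demo - np.log(np.exp(log_eta_demo).sum()))  # subtract the log-sum-exp inside exp
print(np.allclose(direct, via_lse))                                  # True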

However, when this "logsumexp" is computed naively, overflow or underflow can occur...

Therefore, to prevent overflow and underflow in advance, we need the logsumexp function implemented below!

It's been a long preamble, but the function itself is simple, so let's implement it now.

def log_sum_exp(X):
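    # Max trick: subtract each row's maximum before exponentiating so that
    # np.exp cannot overflow; the maximum is added back after taking the log.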
    max_x = np.max(X, axis=1).reshape(-1, 1)
    return np.log(np.sum(np.exp(X - max_x), axis=1).reshape(-1, 1)) + max_x
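As a quick sanity check of my own (not from the book): with large log-domain values the naive computation overflows to inf, while the max-trick version stays finite.

x_demo = np.array([[1000.0, 1001.0, 1002.0]])   # large log-domain values

naive = np.log(np.sum(np.exp(x_demo), axis=1))  # exp(1000) overflows to inf (RuntimeWarning)
stable = log_sum_exp(x_demo)                    # 1002 + log(e^-2 + e^-1 + e^0) ≈ [[1002.4076]]
print(naive, stable)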

I can't explain it any better than that, so for details please see the article "Mixed Gaussian distribution and logsumexp".

Approximate inference of the Poisson mixture model with Gibbs sampling

We will finally implement the algorithm from here!

This time, ["Introduction to Machine Learning by Bayesian Inference"](https://www.amazon.co.jp/ Machine Learning Startup Series-Introduction to Machine Learning by Bayesian Inference-KS Information Science Special Book-Suyama-Atsushi / dp / Implement "Algorithm 4.2 Gibbs Sampling for Poisson Mixed Model" described in 4061538322) based on Python numpy.

# Prepare lists to store the samples
sample_s = []
sample_lambda = []
sample_pi = []

# Constants (number of iterations, number of data points, number of clusters)
MAXITER = 100
N = len(X)
K = 2

# Initial hyperparameter values (Gamma prior a, b; Dirichlet prior α)
init_Gam_param_a = 1
init_Gam_param_b = 1
init_Dir_alpha = np.ones(K)

# Set initial values of λ and π
Lambda = np.ones(K)
Pi = np.ones(K)

# Normalize π to satisfy its constraint (divide each value by the total)
if np.sum(Pi) != 1:
    Pi = Pi / np.sum(Pi)
    
#Repeated sampling for the number of MAXITER
for i in range(MAXITER):
    
    # Prepare the container for s
    s = np.zeros((N, K))
    
    # Calculate log η in order to sample s
    log_eta = np.dot(X.reshape(N, 1), np.log(Lambda.reshape(1, K))) - Lambda.reshape(1, K) + np.log(Pi.reshape(1, K))
    
    # Normalize η (using logsumexp to prevent overflow and underflow)
    logsumexp_eta = -1 * log_sum_exp(log_eta)
    eta = np.exp(log_eta + logsumexp_eta)
    
    # Sample s from the categorical distribution with parameter η
    for n in range(N):
        s[n] = np.random.multinomial(1, eta[n])
    # Append to the sample list
    sample_s.append(np.copy(s))
    
    
    # Calculate a and b in order to sample λ
    Gam_param_a = (np.dot(s.T, X.reshape(N, 1)) + init_Gam_param_a).T[0]
    Gam_param_b = np.sum(s, axis=0).T + init_Gam_param_b
    
    # Sample λ from the gamma distribution with shape a and scale 1/b
    # (np.random.gamma takes (shape, scale); the scale is the inverse of the rate b)
    Lambda = np.random.gamma(Gam_param_a, 1/Gam_param_b)
    # Append to the sample list
    sample_lambda.append(np.copy(Lambda))
    
        
    # Calculate α in order to sample π
    Dir_alpha = np.sum(s, axis=0) + init_Dir_alpha
    
    # Sample π from the Dirichlet distribution with parameter α
    Pi = np.random.dirichlet(Dir_alpha)
    # Append to the sample list
    sample_pi.append(np.copy(Pi))
    
# Convert the sample lists to ndarrays (note: the cluster order is arbitrary)
sample_s_ndarray = np.array(sample_s)
sample_lambda_ndarray = np.array(sample_lambda)
sample_pi_ndarray = np.array(sample_pi)

It is basically implemented according to the book, but you need to be careful about the following points.

- Update each parameter (η; a and b; α) before sampling s, λ, and π respectively.
- π needs to be normalized at initialization (just divide each value by the total).
- η is normalized by combining logsumexp with np.exp().
- The book uses for statements over both N and K, but with matrix computation a for statement is only needed when sampling s.
- The sampled values are appended to their corresponding lists at each iteration.

Also, the initial value of each parameter can simply be set to 1 or so. (Just be careful to normalize π!)
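For example, you can verify the normalization of η with a quick check like this (my own addition, not in the book; eta here holds the value from the last iteration):

# Each row of eta must sum to 1, since s[n] is drawn from a
# categorical distribution with parameter eta[n]
assert np.allclose(eta.sum(axis=1), 1.0)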

Checking the sampling results

Let's check the results in order!

… And before that, note that the clusters come out in no particular order. This time, by chance, the results came out in the same order as the cluster settings, but the order you set and the order of the inferred clusters may not match. Rest assured that there is no problem even in that case.
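If you do want the clusters in a fixed order, one simple trick (my own addition, not from the book) is to sort each iteration's λ and π samples by λ:

# Reorder each iteration's clusters so that cluster 0 always has the smaller λ
order = np.argsort(sample_lambda_ndarray, axis=1)
sorted_lambda = np.take_along_axis(sample_lambda_ndarray, order, axis=1)
sorted_pi = np.take_along_axis(sample_pi_ndarray, order, axis=1)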

Let's start with λ!

# Posterior mean of λ for each cluster (averaged over all iterations)
ave_Lambda = list(np.average(sample_lambda_ndarray, axis=0))

print(f'prediction: {ave_Lambda}')

The result is as follows.

prediction: [29.921538459827033, 49.185569726045905]

λ corresponds to the mean of each cluster. Since we set Lambda_x1 = 30 and Lambda_x2 = 50 when creating the data, the accuracy is pretty good!
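By the way, if you want to see how the sampler behaves over the iterations (my own addition, not from the book), a trace plot of the λ samples is handy; after a short burn-in the chains should wander around 30 and 50. If the early iterations look unstable, you can simply discard them before averaging.

plt.figure(figsize=(12, 5))
plt.plot(sample_lambda_ndarray[:, 0], label='lambda[0]')
plt.plot(sample_lambda_ndarray[:, 1], label='lambda[1]')
plt.xlabel('iteration')
plt.ylabel('sampled lambda')
plt.legend()
plt.show()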

Next, let's look at π.

# Posterior mean of π: each cluster's share of the data
ave_Pi = list(np.average(sample_pi_ndarray, axis=0))

all_samples = samples_x1 + samples_x2  # total number of data points (500)

print(f'prediction: {ave_Pi}')

The results are as follows.

prediction: [0.5801320180878531, 0.4198679819121469]

As for the sample counts, we set samples_x1 = 300 and samples_x2 = 200. Dividing each by the total of 500 data points gives [0.6, 0.4], so the estimates are almost spot on!

Finally, let's check s and we're done! Thank you for reading this far.

# Average number of data points assigned to each cluster (per iteration)
sum_s = np.sum(np.sum(sample_s_ndarray, axis=0), axis=0) / MAXITER

print(f'prediction: {sum_s}')

The result is:

prediction: [291.18 208.82]

Since s is sampled MAXITER times, the accumulated count for each cluster is MAXITER times the per-iteration cluster size, so dividing the total by MAXITER gives the number of data points per cluster.
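As a final visual check (a sketch of my own, not from the book, using only numpy and the standard library), we can overlay the estimated mixture on the data histogram using ave_Lambda and ave_Pi computed above:

from math import lgamma

# Poisson PMF computed in the log domain to avoid huge factorials
def poisson_pmf(k, lam):
    return np.exp(k * np.log(lam) - lam - np.array([lgamma(ki + 1) for ki in k]))

ks = np.arange(X.min(), X.max() + 1)
mix = sum(p * poisson_pmf(ks, l) for p, l in zip(ave_Pi, ave_Lambda))

plt.hist(X, 30, density=True, alpha=0.5, label='all data')
plt.plot(ks, mix, label='estimated mixture')
plt.legend()
plt.show()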

References / Reference Links

["Introduction to Machine Learning by Bayesian Inference"](https://www.amazon.co.jp/ Machine Learning Startup Series-Introduction to Machine Learning by Bayesian Inference-KS Information Science Special Book-Suyama-Atsushi / dp / 4061538322) "Mixed Gaussian distribution and logsumexp"
