(Python) Expected value ・ I tried to understand Monte Carlo sampling carefully

Introduction

I reviewed expected values as part of studying Bayesian statistics.

I referred to the following book.

What is the expected value?

The expected value is the average of a function $f(x)$ under the probability distribution $p(x)$. It is written $E[f]$.

For a discrete distribution, it is given by the following sum.


E[f] = \sum_x p(x)f(x)

For a continuous variable, the sum becomes an integral.


E[f] = \int p(x)f(x)dx
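The two definitions above can be checked numerically. The following is a minimal sketch, not part of the original program: the function names `expectation_discrete` and `expectation_continuous`, the fair-die example, and the uniform-density example are all assumptions chosen for illustration. The integral is approximated with a simple midpoint rule.

```python
import numpy as np

def expectation_discrete(p, f_values):
    """Return sum_x p(x) f(x) for a discrete distribution."""
    p = np.asarray(p, dtype=float)
    f_values = np.asarray(f_values, dtype=float)
    return float(np.sum(p * f_values))

def expectation_continuous(pdf, f, low, high, n_bins=10_000):
    """Approximate the integral of p(x) f(x) over [low, high] with the midpoint rule."""
    dx = (high - low) / n_bins
    x = low + (np.arange(n_bins) + 0.5) * dx  # midpoints of the bins
    return float(np.sum(pdf(x) * f(x)) * dx)

# Hypothetical discrete example: a fair six-sided die with f(x) = x.
# Analytically, E[x] = 3.5.
faces = np.arange(1, 7)
p_die = np.full(6, 1 / 6)
mean_die = expectation_discrete(p_die, faces)

# Hypothetical continuous example: uniform density on [0, 1] with f(x) = x^2.
# Analytically, E[x^2] = 1/3.
mean_sq_uniform = expectation_continuous(
    lambda x: np.ones_like(x), lambda x: x**2, 0.0, 1.0
)

print(mean_die, mean_sq_uniform)
```

Both results should agree with the analytic values up to floating-point and discretization error.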

Entropy

The expected value of $-\ln p(x)$ under the probability distribution $p(x)$ is called the entropy.

\begin{align}
H[p(x)] &= - \sum_x p(x) \ln p(x)
\end{align}
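As a small sketch of this definition, the entropy of a discrete distribution can be computed directly from its probabilities. The function name `entropy` and the two test distributions are assumptions for illustration. The natural logarithm is used, so the result is in nats.

```python
import math

def entropy(probs):
    """Entropy H[p] = -sum_x p(x) ln p(x), in nats."""
    # Terms with p(x) = 0 are skipped, because p ln p tends to 0 as p tends to 0.
    return -sum(p * math.log(p) for p in probs if p > 0)

h_fair_coin = entropy([0.5, 0.5])  # analytically ln 2
h_certain = entropy([1.0, 0.0])    # analytically 0: no uncertainty

print(h_fair_coin, h_certain)
```

The fair coin has the largest entropy among two-outcome distributions, while a distribution that puts all its mass on one outcome has zero entropy.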

Approximation with finite sum (Monte Carlo sampling)

When $\mathbf{z}^{(n)}$ $(n = 1, \dots, N)$ are samples drawn independently from the distribution $p(x)$, the expected value can be approximated by the finite sum


E[f] \simeq \frac{1}{N} \sum_{n=1}^{N} f(\mathbf{z}^{(n)})
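The finite-sum approximation can be sketched as a generic helper: draw $N$ samples, apply $f$ to each, and take the mean. The names `monte_carlo_expectation` and `sample_bernoulli`, the value $p(x = 1) = 0.25$, and the fixed seed are assumptions made only for this illustration.

```python
import numpy as np

def monte_carlo_expectation(f, sampler, n_samples):
    """Approximate E[f] by averaging f over n_samples draws from sampler."""
    z = sampler(n_samples)
    return float(np.mean(f(z)))

rng = np.random.default_rng(0)  # fixed seed, used only for reproducibility

def sample_bernoulli(n):
    """Hypothetical sampler: x = 1 with probability 0.25, otherwise x = 0."""
    return (rng.random(n) < 0.25).astype(float)

# With f(x) = x, the exact expected value is p(x = 1) = 0.25.
estimate = monte_carlo_expectation(lambda z: z, sample_bernoulli, 100_000)
print(estimate)
```

The estimate fluctuates around the exact value, and the fluctuation shrinks as the number of samples grows.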

Let's consider an example here. Consider a discrete distribution such that $p(x = 1) = 0.3$ and $p(x = 2) = 0.7$.

From the definition, the entropy is

\begin{align}
H[p(x)] &= - \sum_x p(x) \ln p(x)\\
&= -\left(p(x=1) \ln p(x=1) + p(x=2) \ln p(x=2)\right)\\
&= -\left(\frac{3}{10} \ln \frac{3}{10} + \frac{7}{10} \ln \frac{7}{10}\right)\\
&\approx 0.611
\end{align}

Next, let's approximate this with a finite sum. random.uniform(0, 1) returns a random value between 0 and 1. The sample is taken as $x = 2$ if that value is greater than $p(x = 1) = 0.3$, and as $x = 1$ otherwise. cnt records a 1 for each $x = 2$ draw and a 0 for each $x = 1$ draw, so its sum is the number of times $x = 2$ has appeared. The program is as follows. I drew 1000 samples as a trial.

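import math
import random

import matplotlib.pyplot as plt
import numpy as np

p1, p2 = 0.3, 0.7  # p(x = 1) and p(x = 2)
time = 1000        # number of samples to draw
cnt = []           # 1 for each x = 2 draw, 0 for each x = 1 draw
proba_1 = []       # running estimate of p(x = 1)
proba_2 = []       # running estimate of p(x = 2)
exp = []           # running Monte Carlo estimate of the entropy

for i in range(time):
    a = random.uniform(0, 1)
    cnt.append(1 if a > p1 else 0)
    n2 = sum(cnt)    # number of x = 2 draws so far
    n1 = i + 1 - n2  # number of x = 1 draws so far
    proba_1.append(n1 / (i + 1))
    proba_2.append(n2 / (i + 1))
    # Sample mean of -ln p(x) over the draws so far
    exp.append(-(n1 * math.log(p1) + n2 * math.log(p2)) / (i + 1))

# time_plot is not defined in the excerpt; assumed to be the sample counts 1..time
time_plot = np.arange(1, time + 1)
plt.xlabel('time')
plt.ylabel('probability')
plt.plot(time_plot, proba_2, label="p2")
plt.plot(time_plot, proba_1, label="p1")
plt.legend()
plt.show()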

[Figure 002.png: estimated p1 and p2 versus the number of samples]

The estimates converge to $p(x = 1) = 0.3$ and $p(x = 2) = 0.7$ after roughly 100 samples.

[Figure 003.png: Monte Carlo estimate of the entropy versus the number of samples]

The estimate of the expected value (the entropy) likewise converges to about 0.61, the value obtained analytically above, after roughly 100 samples.

This confirms that the approximation works for this example.

Conclusion

This was a very simple example, so the expected value was easy to calculate and check by hand. In real problems, however, it is often difficult to obtain the expected value analytically. In those cases, approximating it with Monte Carlo sampling is a useful technique to keep in mind.
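The same approach carries over to continuous distributions. As a hedged illustration, the sketch below estimates $E[x^2]$ under a standard normal distribution, where the exact value is known to be 1, so the estimate can be checked. The choice of distribution, the function $f(x) = x^2$, the sample sizes, and the fixed seed are all assumptions for this example and do not come from the original program.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, used only for reproducibility

def estimate_second_moment(n_samples):
    """Monte Carlo estimate of E[x^2] for x drawn from a standard normal."""
    z = rng.standard_normal(n_samples)
    return float(np.mean(z**2))

# The exact value is 1 (the variance of a standard normal).
sample_sizes = [100, 10_000, 1_000_000]
estimates = [estimate_second_moment(n) for n in sample_sizes]

for n, est in zip(sample_sizes, estimates):
    print(f"N = {n:>9}: estimate = {est:.4f}")
```

With more samples the estimate typically moves closer to 1, which is the behavior one relies on when no analytic value is available.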

The full text of the program is here. https://github.com/Fumio-eisan/VI20200520
