Introduction

This is my first post. I will summarize the points that interested me in "Pattern Recognition and Machine Learning" (PRML), which I am currently reading (Chapter 2, Section 2.1, from p. 66).

What is the Bernoulli distribution?
What is maximum likelihood estimation?
Disadvantages of maximum likelihood estimation

1. What is the Bernoulli distribution?

Let's start with the definition. A random variable $ x \in \{0, 1\} $ that follows a Bernoulli distribution with mean $ u $ satisfies

P(x=1|u)=u, \quad P(x=0|u)=1-u

Combining the two, this can also be written as

P(x|u)=u^x (1-u)^{1-x}

A simple example is a coin that comes up heads ($ x = 1 $) with probability $ u $. The following sections also use a coin as the example.
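The definition can be sketched in a few lines of Python. The function name `bernoulli_pmf` and the value `u = 0.3` are assumptions chosen only for illustration.

```python
def bernoulli_pmf(x: int, u: float) -> float:
    """Return P(x | u) = u^x * (1 - u)^(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    return u**x * (1.0 - u) ** (1 - x)


# Hypothetical coin that comes up heads with probability 0.3.
p_heads = bernoulli_pmf(1, 0.3)  # equals u
p_tails = bernoulli_pmf(0, 0.3)  # equals 1 - u
```

The single formula covers both cases because one of the two exponents is always zero.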

2. What is maximum likelihood estimation?

Maximum likelihood estimation is a method for estimating the mean $ u $ from a given sample. Given $ N $ samples

x_1, x_2, \ldots, x_N

the value $ u_{ML} $ that maximizes the likelihood function $ L $ defined below

L(u) = \prod_{i=1}^N u^{x_i}(1-u)^{1-x_i}

is taken as the estimate of the true mean $ u $.
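To make the likelihood function concrete, here is a minimal sketch that evaluates it for a small sample. The helper name `likelihood` and the sample `xs` are hypothetical and chosen only for illustration.

```python
import math


def likelihood(u, xs):
    """L(u): product over the samples of u^x * (1 - u)^(1 - x)."""
    return math.prod(u**x * (1 - u) ** (1 - x) for x in xs)


xs = [1, 0, 0]  # hypothetical sample: one head, two tails

l_half = likelihood(0.5, xs)     # 0.5 * 0.5 * 0.5
l_third = likelihood(1 / 3, xs)  # (1/3) * (2/3) * (2/3)
```

For this sample, `u = 1/3` gives a larger likelihood than `u = 0.5`, which is the kind of comparison that maximum likelihood estimation carries out over all values of $ u $.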

Let's actually find the $ u $ that maximizes the likelihood function $ L $. First, to simplify the expression, we take the logarithm of $ L $. Because the logarithm is monotonically increasing, this does not change the maximizer.

\log L(u) = \sum_{i=1}^N \{ x_i \log u + (1-x_i) \log(1-u) \}

Setting the derivative of $ \log L(u) $ with respect to $ u $ to zero gives the maximizer $ u_{ML} $:

u_{ML} = \frac{1}{N} \sum_{i=1}^N x_i
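For reference, the intermediate steps behind this estimator can be written out as follows. This is a sketch of the standard calculation, assuming $ 0 < u < 1 $ so that both logarithms are defined.

```latex
\frac{d}{du} \log L(u)
  = \sum_{i=1}^N \left( \frac{x_i}{u} - \frac{1 - x_i}{1 - u} \right) = 0
\;\Longrightarrow\;
(1 - u) \sum_{i=1}^N x_i = u \left( N - \sum_{i=1}^N x_i \right)
\;\Longrightarrow\;
\sum_{i=1}^N x_i = N u
\;\Longrightarrow\;
u_{ML} = \frac{1}{N} \sum_{i=1}^N x_i
```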

That is, if $ x = 1 $ was observed $ m $ times in $ N $ trials, the estimate is

u_{ML} = \frac{m}{N}
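As a rough numerical check of this result, the sketch below evaluates the log-likelihood on a grid of candidate values and compares the best candidate with the sample mean. The sample `xs`, the helper name `log_likelihood`, and the grid resolution are assumptions made for illustration.

```python
import math


def log_likelihood(u, xs):
    """log L(u) = sum over samples of x*log(u) + (1 - x)*log(1 - u)."""
    return sum(x * math.log(u) + (1 - x) * math.log(1 - u) for x in xs)


xs = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # hypothetical sample: m = 3, N = 10

# Candidate values strictly inside (0, 1) so that both logs are defined.
grid = [k / 1000 for k in range(1, 1000)]
u_grid = max(grid, key=lambda u: log_likelihood(u, xs))

# Closed-form estimate: the sample mean m / N.
u_ml = sum(xs) / len(xs)
```

A grid search is used here only because it is easy to read; the closed-form sample mean is what one would use in practice.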

Let's try maximum likelihood estimation on the coin example. Suppose you want to know the probability that a coin comes up heads. As a first experiment, I tossed it 10 times and obtained the following results.

Heads ... 3 times
Tails ... 7 times

Following the method above, we find the $ u_{ML} $ that maximizes the likelihood function.

u_{ML} = \frac{1}{N} \sum_{i=1}^N x_i \\
 = \frac{1}{10} \sum_{i=1}^{10} x_i \\
= \frac{3}{10}

Therefore, we estimate that "the probability that this coin comes up heads is $ \frac{3}{10} $".
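The same calculation can be sketched in Python. The order of the tosses is not given in the text, so the list below is an assumed ordering with 3 heads and 7 tails; `mle_bernoulli` is a hypothetical helper name.

```python
from fractions import Fraction


def mle_bernoulli(xs):
    """Maximum likelihood estimate of u: the sample mean of 0/1 outcomes."""
    return Fraction(sum(xs), len(xs))


# 1 = heads, 0 = tails; 3 heads and 7 tails (the ordering is assumed).
tosses = [1] * 3 + [0] * 7

u_ml = mle_bernoulli(tosses)
print(u_ml)  # 3/10
```

`Fraction` is used so that the result is shown as an exact ratio rather than a rounded float.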

3. Disadvantages of maximum likelihood estimation

In the previous section, we found that the maximum likelihood estimate for the Bernoulli distribution depends only on how many times the event occurred in the trials. The drawback is that if a coin is tossed three times and comes up heads every time, the estimate is "the probability that this coin comes up heads is 1". In other words, with a small number of trials, maximum likelihood estimation overfits the observed data.
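The small-sample problem can be illustrated with a short sketch. It assumes, purely for illustration, a fair coin ($ u = 0.5 $) and enumerates every possible sequence of three tosses; `mle_bernoulli` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product


def mle_bernoulli(xs):
    """Maximum likelihood estimate of u: the sample mean of 0/1 outcomes."""
    return Fraction(sum(xs), len(xs))


# Three tosses, all heads: the estimate is exactly 1.
all_heads = mle_bernoulli([1, 1, 1])

# Assumption: a fair coin, so all 8 sequences of three tosses are equally likely.
outcomes = list(product([0, 1], repeat=3))
estimates = [mle_bernoulli(seq) for seq in outcomes]

# Share of sequences whose estimate is the extreme value 0 or 1.
extreme = sum(1 for e in estimates if e in (0, 1))
prob_extreme = Fraction(extreme, len(outcomes))
```

Under this assumption, one experiment in four yields an estimate of 0 or 1, and no three-toss experiment can produce the true value 0.5 at all.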

[PYTHON] Advantages and disadvantages of maximum likelihood estimation
