[Basics of modern mathematical statistics with python] Chapter 2: Probability distribution and expected value

Introduction

This series is a brief walkthrough of "Basics of Modern Mathematical Statistics" by Tatsuya Kubokawa, implementing its contents in Python. I used Google Colaboratory (hereinafter "Colab") for the implementation. If you have any suggestions, I would appreciate it if you left them in the comments. Since I only cover the parts I felt needed explanation, this article may not suit readers who want to understand the entire book thoroughly. Also note that equation and proposition/definition numbers follow the book, so the numbering may skip in places.

Overview of Chapter 2

First, we give a rigorous formulation of the random variables we use casually, and explain probability distributions for the discrete and continuous cases. The similar-sounding terms can be confusing at first, but once you understand the content you will not get lost. Next, the expected value is defined, and the variance, standard deviation, and so on are explained. Probability generating functions, moment generating functions, and characteristic functions may be new to you, but they are important tools that deepen your knowledge of statistics. For the change of variables at the end, it is enough to grasp the idea and work through the details each time you need it. Chapters 1 and 2 are preparation for Chapter 3 onward, so even if your understanding is not perfect yet, you can fill in the gaps as you read on.

Random variable

A random variable does not carry all the detail of the events you have in mind; it discards the parts that do not matter and makes the problem easier to handle. For example, suppose you randomly select 100 people and ask whether they like guppies. If individuals are distinguished, the set of all outcomes $\Omega$ consists of $2^{100}$ elements. But what we want to know is simply how many of the 100 people like guppies. Code each answer as 1 ("like") or 0 ("dislike"), and let the random variable $X$ be the number of people who answered "like". Then

$$\Omega = \{(0,0,\ldots,0),\ (1,0,\ldots,0),\ \ldots,\ (1,1,\ldots,1)\}$$

$$\mathcal{X} = \{0,1,2,\ldots,100\}$$

where $\mathcal{X}$ is the sample space of $X$. Its number of elements is orders of magnitude smaller and much easier to handle. A random variable $X$ is basically a variable taking values on the real line.

Probability distribution

Cumulative distribution function

Definition:

The cumulative distribution function of a random variable $X$ is $F_X(x) = P(X \leq x)$.

Example: What is the probability of rolling a die once and getting 4 or less? Answer: $F_X(4) = P(X \leq 4) = 4/6 = 2/3$. The cumulative distribution function is also simply called the distribution function. A random variable $X$ is called a discrete random variable when it takes discrete values, like a die, and a continuous random variable when it takes continuous values, like temperature.
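The die example can be checked directly in Python (a minimal sketch; the fair-die probabilities are the only assumption):

```python
def dice_cdf(x):
    """P(X <= x) for one roll of a fair six-sided die."""
    # Count the faces at or below x, each with probability 1/6.
    return sum(1 for face in range(1, 7) if face <= x) / 6

print(dice_cdf(4))  # F_X(4) = 4/6 = 2/3
```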

Probability function / probability density function

The cumulative distribution function $F_X(x)$ deals with the cumulative probability $P(X \leq x)$; next we consider the pinpoint probability of $X = x$.

**・Discrete type**

$f_X(x) = P(X = x)$ is called the probability function. Plug a value into the variable and you get its probability. For a discrete random variable $X$, the probability function $f_X(x)$ can be expressed as

$$
f_X(x) = \left\{ \begin{array}{ll}
  p(x_i) & (x = x_i) \\
  0 & (x \notin \mathcal{X})
\end{array} \right.
$$

I have omitted the precise wording, but the symbols have the same meanings as before.

**・Continuous type**

In the continuous case, the probability of a single value cannot be computed this way. For example, even if you try to pin down the real number 1 on the real line, it continues infinitely as 1.0000000000.... So instead of a single point, we consider the probability that the variable falls within a small interval.

Definition:

For a continuous random variable $X$, if there exists a function $f_X(x)$ such that

$$
F_X(x) = \int_{-\infty}^x f_X(t)\, dt, \quad -\infty < x < \infty \tag{1}
$$

then $f_X(x)$ is called the **probability density function**.

For example, questions like "what is the probability that tomorrow's temperature $T\,[^\circ\mathrm{C}]$ satisfies $22 \leq T \leq 25$?" are handled with this way of thinking. Here $F_X(x)$ is the cumulative distribution function. I think the term "density" will soon feel natural. Since it is a probability, of course

$$\int_{-\infty}^{\infty} f_X(x)\, dx = 1 \tag{2}$$

From equation (1), we also see that $f_X(x) = \frac{d}{dx} F_X(x)$. The probability density function converges to 0 in the limit $x \to \pm\infty$, because the cumulative distribution function, which is its integral, converges to 1.
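Both properties, equation (2) and $f_X(x) = \frac{d}{dx}F_X(x)$, are easy to check numerically. A sketch using SciPy, with the standard normal as the example density (this particular choice of density and evaluation point are mine, not the book's):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Property (2): the density integrates to 1 over the whole real line.
total, _ = quad(norm.pdf, -np.inf, np.inf)
print(total)

# f_X(x) = d/dx F_X(x): compare the pdf with a central finite-difference
# derivative of the cdf at x = 0.5.
h = 1e-6
numeric_derivative = (norm.cdf(0.5 + h) - norm.cdf(0.5 - h)) / (2 * h)
print(numeric_derivative, norm.pdf(0.5))
```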

Expected value

First, from the definition of expected value:

The expected value of a function $g(X)$ of a random variable $X$, written $E[g(X)]$, is defined as

$$
E[g(X)] = \left\{ \begin{array}{ll}
  \int_{-\infty}^{\infty} g(x) f_X(x)\, dx & (\text{when } X \text{ is a continuous random variable}) \\
  \sum_{x_i \in \mathcal{X}} g(x_i) f_X(x_i) & (\text{when } X \text{ is a discrete random variable})
\end{array} \right.
$$

Here $f_X(x)$ is the probability (density) function from above. In other words, you are summing, over each value $x$, the product of $g(x)$ and the probability that that value occurs. The expected value matters because the mean and variance, which are the characteristic values (condensed information) of a probability distribution, are themselves expected values of functions $g(X)$ of a random variable $X$.

**・Mean**

When $g(X) = X$, the expected value $E[X]$ is called the mean of $X$, and is written $E[X] = \mu$. Under translation and scaling,

$$E[aX+b] = aE[X] + b$$

**・Variance**

When $g(X) = (X - E[X])^2$, the expected value $E[(X - \mu)^2]$ is called the variance of $X$, and is written $V(X)$ or $\sigma^2$. $\sigma = \sqrt{V(X)}$ is called the standard deviation of $X$. Variance measures how spread out the data are, and the standard deviation is easier to interpret because taking the square root brings it back to the original units. I will omit the proof, but under translation and scaling the variance satisfies

$$V[aX+b] = a^2 V[X]$$

Since the variance is built from the square of the deviations (the differences between each data point and the mean), I think this makes sense. You can also see intuitively that translating the data does not change how spread out it is.
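The discrete expected-value formula and both transformation rules, $E[aX+b] = aE[X]+b$ and $V[aX+b] = a^2V[X]$, can be verified on the die example (a small sketch; the fair die and the values $a=2$, $b=3$ are my own choices for illustration):

```python
faces = range(1, 7)
p = 1 / 6  # probability function of a fair die: f_X(x) = 1/6 for each face

# E[g(X)] = sum of g(x_i) * f_X(x_i) for a discrete random variable
mean = sum(x * p for x in faces)                  # g(X) = X
var = sum((x - mean) ** 2 * p for x in faces)     # g(X) = (X - mu)^2
print(mean, var)

# Y = aX + b with a = 2, b = 3
a, b = 2, 3
mean_y = sum((a * x + b) * p for x in faces)
var_y = sum((a * x + b - mean_y) ** 2 * p for x in faces)

print(abs(mean_y - (a * mean + b)))   # ~0: E[aX+b] = aE[X] + b
print(abs(var_y - a**2 * var))        # ~0: V[aX+b] = a^2 V[X]
```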

 * The probability generating function, moment generating function, and characteristic function would make this article too long, so I will introduce them in a separate article. As the names suggest, they are functions from which the probability function and the moments can be obtained automatically.

# Let's run python
 Now let's use Python to look at the probability density function and the cumulative distribution function of the standard normal distribution (which we will see in the next chapter).

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm

fig, ax = plt.subplots()

x1 = np.arange(-5, 5, 0.1)
x2 = np.arange(-5, 5, 0.01)
y = np.exp(-x2**2 / 2) / np.sqrt(2 * np.pi)  # pdf of the standard normal
Y = norm.cdf(x1, loc=0, scale=1)             # cdf of the standard normal

c1, c2 = "red", "blue"

ax.set_xlabel("x")
ax.set_ylabel("probability")
plt.grid(True)
plt.plot(x1, Y, color=c1, label="cdf")
plt.plot(x2, y, color=c2, label="pdf")
plt.legend()
plt.show()
```

When you run this, it will look like the figure below. image.png The blue graph is the probability density function $f_X(x)$ of the standard normal distribution, and the red graph is the cumulative distribution function $F_X(x)$. You can see that the cumulative distribution function rises from 0 to 1.

This is the end of Chapter 2. Thank you very much.

References

"Basics of Modern Mathematical Statistics" by Tatsuya Kubokawa
