When I first tried to overlay a histogram and a probability density function (PDF), the two did not line up neatly, so I am leaving a note of how I solved it.
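A frequency histogram of random numbers generated with numpy.random has an area equal to (number of samples) × (class width), so depending on how many samples you generate it is usually much larger than 1, whereas the area under a probability density function is always 1.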
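To make the mismatch concrete, here is a minimal sketch that computes the area of a frequency histogram for several sample sizes. The helper name `frequency_histogram_area`, the fixed seed, and the sample sizes are assumptions chosen for illustration; they are not part of the original note.

```python
import numpy as np

def frequency_histogram_area(data, edges):
    """Area of a frequency (count) histogram: sum of count * class width."""
    counts, _ = np.histogram(data, bins=edges)
    widths = np.diff(edges)
    return float(np.sum(counts * widths))

rng = np.random.default_rng(0)   # fixed seed, only for reproducibility
edges = np.linspace(-3, 3, 61)   # 60 classes of width 0.1 over [-3, 3]

for n in (100, 2000, 50000):
    data = rng.standard_normal(n)
    # The area grows roughly like n * 0.1 (samples outside [-3, 3] are not counted)
    print(n, frequency_histogram_area(data, edges))
```

The printed area scales with the number of samples, which is why an unscaled PDF looks flat next to a frequency histogram.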
There are two ways to fix this:
-- Normalize the histogram
-- Scale the probability density function to match the area of the histogram
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
n = 2000  # number of samples
data = np.random.randn(n)
# Histogram with class width 0.1 (60 bins over [-3, 3]) and 2000 samples;
# density=True puts the relative frequency density on the vertical axis
plt.hist(data, range=(-3, 3), bins=60, alpha=0.5, density=True)
# Evaluate the PDF at points spaced 0.1 apart
x = np.linspace(-3, 3, 61)
plt.plot(x, norm.pdf(x), c='r')
plt.show()
Setting the vertical axis to the relative frequency density instead of the frequency makes the area of the histogram equal to 1.
Let the class width be w, let n index the classes, and let D_n be the frequency of class n. The area S of the original (frequency) histogram is
S=\sum_{n}D_n w
The relative frequency is
D_n^R=\frac{D_n}{\sum_{n}D_n}
and the relative frequency density is
\begin{aligned}
D_n^{'}&=\frac{D_n^R}{w}\\
&=\frac{D_n}{\sum_{n}D_n} \times \frac{1}{w}\\
&=\frac{D_n}{S}
\end{aligned}
The area S' of the histogram drawn with the relative frequency density is therefore
\begin{aligned}
S^{'}&=\sum_{n}D_n^{'}w\\
&=\frac{\sum_{n}D_n w}{S}\\
&=\frac{S}{S}\\
&=1
\end{aligned}
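The normalization derived above can be checked numerically. The sketch below computes the relative frequency density by hand and compares it with `np.histogram(..., density=True)`. The function name `relative_frequency_density` and the fixed seed are assumptions made for this example.

```python
import numpy as np

def relative_frequency_density(data, edges):
    """Relative frequency density: D_n / (sum of D_n * class width)."""
    counts, _ = np.histogram(data, bins=edges)  # frequencies D_n
    widths = np.diff(edges)                     # class widths w
    return counts / (counts.sum() * widths)

rng = np.random.default_rng(0)   # fixed seed, only for reproducibility
data = rng.standard_normal(2000)
edges = np.linspace(-3, 3, 61)   # 60 classes of width 0.1

manual = relative_frequency_density(data, edges)
builtin, _ = np.histogram(data, bins=edges, density=True)

print(np.allclose(manual, builtin))               # do the two normalizations agree?
print(float(np.sum(manual * np.diff(edges))))     # total area of the normalized histogram
```

Note that the normalization uses only the samples that fall inside the bin range, so the histogram area is 1 even if a few samples lie outside [-3, 3].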
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
n = 2000  # number of samples
data = np.random.randn(n)
# Histogram with class width 0.1 (60 bins over [-3, 3]) and 2000 samples;
# the vertical axis is the raw frequency (no density=True)
plt.hist(data, range=(-3, 3), bins=60, alpha=0.5)
# Evaluate the PDF at points spaced 0.1 apart
x = np.linspace(-3, 3, 61)
w = 0.1  # class width
# Scale the PDF by the histogram area n * w
plt.plot(x, n * w * norm.pdf(x), c='r')
plt.show()
Here the areas are matched by multiplying the probability density function by the area of the histogram.
Let f(x) be the probability density function, n the total number of data in the histogram, and w the class width.
f(x) satisfies
\begin{aligned}
\int f(x) dx = 1
\end{aligned}
Multiplying both sides by n × w makes the right-hand side equal to the area of the histogram, n × w.
Therefore, the scaled probability density function f'(x) is
f^{'}(x)=n\times w \times f(x)
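As a rough check of this scaling, the sketch below compares the area of a frequency histogram with the area under n × w × f(x) over the plotted range [-3, 3]. The fixed seed and the use of `norm.cdf` to get the area under the curve are choices made for this example, not part of the original note.

```python
import numpy as np
from scipy.stats import norm

n = 2000   # number of samples
w = 0.1    # class width

rng = np.random.default_rng(0)   # fixed seed, only for reproducibility
data = rng.standard_normal(n)

# Area of the frequency histogram: sum of count * class width
counts, edges = np.histogram(data, bins=60, range=(-3, 3))
hist_area = float(np.sum(counts * np.diff(edges)))

# Area under the scaled PDF n * w * f(x) on [-3, 3], via the normal CDF
pdf_area = n * w * float(norm.cdf(3) - norm.cdf(-3))

print(hist_area, pdf_area, n * w)
```

Both areas come out slightly below n × w because a small fraction of a standard normal distribution (about 0.3%) lies outside [-3, 3].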