[PYTHON] Behavior of matplotlib: histogram normed

Histogram with matplotlib

A note on what tripped me up when creating a histogram with matplotlib.

Pitfalls of normed

Note that matplotlib.axes.Axes.hist() and matplotlib.pyplot.hist(), which are used to create histograms with matplotlib, have a nasty pitfall. When creating a histogram, we often add the option normed=1 to normalize the frequencies (newer matplotlib releases call this option density). Even with this option, however, the histogram may look like the one below.

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(0)
nor = np.random.normal(0,0.5,1000)
plt.hist(nor,normed=1,range=(-3,3),bins=300)
plt.savefig("test.pdf")

[Figure: histogram with bins=300; the y-axis ticks go above 1]

Wait, the values on the y-axis are greater than 1...

At first, I thought this was a bug and tried to force a histogram out of a bar plot, but that was too much trouble. Next I tried seaborn's distplot, but it had the same problem. While puzzling over this, I noticed one thing.

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(0)
nor = np.random.normal(0,0.5,1000)
plt.hist(nor,normed=1,range=(-3,3),bins=6)
plt.savefig("test.pdf")

[Figure: histogram with bins=6; the y-axis stays below 1]

If the bin width is 1, it works as expected. Could it be... Yes: the histogram is normalized so that the total area of the bins is 1, not the sum of their heights. You can confirm this by looking at the data with numpy's histogram function (which behaves the same way as matplotlib's hist):

import numpy as np
np.random.seed(0)
nor = np.random.normal(0,0.5,1000)
hist,pos = np.histogram(nor,normed=1,range=(-3,3),bins=300)
print(np.sum(hist))
#50.0
print(np.sum(hist)*6.0/300)
#1.0
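
The same check can be written without hard-coding the bin width. The following is a minimal sketch, assuming a NumPy release where the normalization argument is called `density` (the newer name for `normed`, which recent releases no longer accept). It multiplies each bar height by its own bin width, obtained with `np.diff` on the returned bin edges, and sums the result.

```python
import numpy as np

np.random.seed(0)
nor = np.random.normal(0, 0.5, 1000)

# density=True is assumed here as the modern spelling of normed=1
hist, edges = np.histogram(nor, density=True, range=(-3, 3), bins=300)

widths = np.diff(edges)        # width of every bin (0.02 each here)
area = np.sum(hist * widths)   # total area under the histogram

print(np.sum(hist))  # sum of bar heights, roughly 1 / bin width
print(area)          # total area, which the normalization fixes at 1
```

Because the widths come from the edges themselves, the check also works when the bins are not all the same size.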

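In other words, the bar heights sum to 1 only after they are multiplied by the bin width. Therefore, you can rewrite the code as follows so that the y-axis shows the fraction of samples in each bin.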

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(0)
nor = np.random.normal(0,0.5,1000)
binnum = 300
fig = plt.figure()
ax = plt.subplot(111)
ax.hist(nor,normed=1,range=(-3,3),bins=binnum)
ax_yticklocs = ax.yaxis.get_ticklocs()  # Get the current y-axis tick positions
ax_yticklocs = list(map(lambda x: x * len(range(-3,3))*1.0/binnum, ax_yticklocs))  # Multiply each tick value by the bin width (6/300)
ax.yaxis.set_ticklabels(list(map(lambda x: "%0.2f" % x, ax_yticklocs)))  # Show the rescaled tick labels
plt.savefig("test.pdf")

[Figure: histogram with bins=300 and rescaled y-axis tick labels]

This should give you the scale you want. However, this implementation is rather convoluted...
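
One possibly simpler alternative, offered as a sketch rather than a definitive fix: instead of relabeling the ticks afterwards, pass `weights` so that every sample contributes 1/N to its bin. The bar heights are then the fraction of samples per bin, whatever the bin width. The output filename `test_weights.pdf` and the `Agg` backend are assumptions made for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the script runs headless
import matplotlib.pyplot as plt

np.random.seed(0)
nor = np.random.normal(0, 0.5, 1000)
binnum = 300

# Each sample gets weight 1/N, so a bar's height is the fraction of samples in that bin
weights = np.ones_like(nor) / len(nor)

fig, ax = plt.subplots()
heights, edges, patches = ax.hist(nor, weights=weights, range=(-3, 3), bins=binnum)

print(heights.sum())  # fraction of all samples that fall inside the range
print(heights.max())  # tallest bar, a fraction that cannot exceed 1
fig.savefig("test_weights.pdf")  # hypothetical output filename
```

With this approach the plotted values themselves are fractions, so the tick labels need no adjustment. Note that the result is then no longer a probability density.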
