[PYTHON] The vertical and horizontal axes of a matplotlib histogram look awkward, so let's tidy them up

I often draw histograms with Python's matplotlib, but the default vertical and horizontal axes sometimes bother me, so this is a memo on fine-tuning them.

Vertical and horizontal axes of hist

As an example, I will show you the following graph.

%matplotlib inline
import matplotlib.pyplot as plt
from scipy import stats

norm_rvs = stats.norm.rvs(loc=50, scale=30, size=100, random_state=0)
plt.hist(norm_rvs, bins=10, alpha=0.5, ec='navy')
plt.show()

matplotlib_histgram_2_0.png

Looking at this, two things bother me:

- The bin edges of the histogram bars fall on awkward, non-round values.
- The ticks on the vertical axis may not be integers, even though the bars are counts.

So let's fix both.

I want integer ticks on the vertical axis

You can get the bin edges and bar heights of the histogram from the return values of plt.hist:

Y, X, _ = plt.hist(norm_rvs, bins=10, alpha=0.5, ec='navy')
print(X)
print(Y)
plt.show()
[-26.58969448 -12.12146116   2.34677216  16.81500548  31.2832388
  45.75147212  60.21970544  74.68793876  89.15617208 103.6244054
 118.09263872]
[ 1.  5.  7. 13. 17. 18. 16. 11.  7.  5.]

matplotlib_histgram_4_1.png
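
For reference, plt.hist returns three values: the bar heights (counts), the bin edges, and the drawn bar objects. The counts and edges match what numpy.histogram computes, so they can also be obtained without drawing anything. A minimal sketch, assuming the same sample as above (the names counts and edges are mine, not from the original code):

```python
import numpy as np
from scipy import stats

# Same sample as in the article
norm_rvs = stats.norm.rvs(loc=50, scale=30, size=100, random_state=0)

# np.histogram returns (counts, bin_edges); plt.hist returns the same
# values plus the patch objects for the drawn bars.
counts, edges = np.histogram(norm_rvs, bins=10)

print(len(counts))   # number of bars
print(len(edges))    # number of edges, one more than the number of bars
print(counts.sum())  # total number of samples that fell into the bins
```

Without a range argument, the first and last edges are simply the minimum and maximum of the data, which is why the default edges are rarely round numbers.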

Let's use that information to make the vertical axis an integer.

import numpy as np

Y, X, _ = plt.hist(norm_rvs, bins=10, alpha=0.5, ec='navy')
y_max = int(max(Y)) + 1
plt.yticks(np.arange(0, y_max, 2))  # Ticks in steps of 1 are too crowded to read, so use steps of 2.
plt.show()

matplotlib_histgram_6_0.png
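
As an alternative to computing the ticks by hand, matplotlib's tick locators can be asked to use integer positions only. This is not what the code above does; it is a sketch of a different approach using matplotlib.ticker.MaxNLocator, with stand-in data generated by NumPy (the names rng and data are assumptions for illustration):

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MaxNLocator

# Stand-in data (assumption): 100 normally distributed samples
rng = np.random.default_rng(0)
data = rng.normal(loc=50, scale=30, size=100)

fig, ax = plt.subplots()
ax.hist(data, bins=10, alpha=0.5, ec='navy')

# Let matplotlib choose the tick spacing, but only at integer values
ax.yaxis.set_major_locator(MaxNLocator(integer=True))
plt.show()
```

The trade-off is that the tick spacing is chosen automatically, so use the np.arange approach when you want a fixed step such as 2.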

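I want the bin edges to fall on round values

Specify the range of the horizontal axis and choose the number of bins so that each bin has a round width. Here the range (-10, 120) spans 130, so 13 bins gives a bin width of exactly 10.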

Y, X, _ = plt.hist(norm_rvs, bins=13, alpha=0.5, ec='navy', range=(-10, 120))
print(X)
print(Y)
y_max = int(max(Y)) + 1
plt.yticks(np.arange(0, y_max, 2))
plt.show()
[-10.   0.  10.  20.  30.  40.  50.  60.  70.  80.  90. 100. 110. 120.]
[ 3.  5.  6. 10. 11.  9. 15. 13.  9.  6.  5.  5.  2.]

matplotlib_histgram_8_1.png
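
Instead of picking the range and the number of bins by eye, the edges can be computed from the data and a desired bin width. The helper below, round_bin_edges, is a hypothetical function of my own (not part of matplotlib or NumPy); it is a sketch that rounds the data minimum down and the maximum up to multiples of the width:

```python
import numpy as np

def round_bin_edges(data, width):
    """Return bin edges that are multiples of `width` and cover `data`."""
    start = np.floor(np.min(data) / width) * width
    stop = np.ceil(np.max(data) / width) * width
    if stop == start:
        # All values sit on a single multiple of `width`; keep at least one bin
        stop = start + width
    n_bins = int(round((stop - start) / width))
    # linspace avoids the floating-point drift np.arange can show
    # with non-integer steps
    return np.linspace(start, stop, n_bins + 1)

# Small made-up sample for illustration
edges = round_bin_edges([-26.6, 3.2, 47.5, 118.1], width=10)
print(edges)
```

The resulting array can be passed directly as the bins argument, for example plt.hist(norm_rvs, bins=edges), since hist accepts a sequence of edges as well as a bin count.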

I want to make multiple histograms look good

Now, you may want to compare multiple histograms by overlaying them on the same axes.

norm_rvs2 = stats.norm.rvs(loc=75, scale=55, size=100, random_state=0)
plt.hist(norm_rvs, bins=10, alpha=0.5, ec='navy')
plt.hist(norm_rvs2, bins=10, alpha=0.5, ec='red')
plt.show()

matplotlib_histgram_11_0.png

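Drawn like this, the result tends to look bad, because each histogram gets its own bin edges and bin width. Let's tidy this up as well by giving both histograms the same bins and range.

bins = 20
hist_range = (-50, 200)  # named hist_range so the built-in range() is not shadowed

Y1, X1, _ = plt.hist(norm_rvs, bins=bins, alpha=0.5, ec='navy', range=hist_range)
Y2, X2, _ = plt.hist(norm_rvs2, bins=bins, alpha=0.5, ec='red', range=hist_range)
y_max = int(max(max(Y1), max(Y2))) + 1
plt.yticks(np.arange(0, y_max, 2))
plt.show()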

matplotlib_histgram_13_0.png
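
If you would rather not hard-code the range, one option is to compute a single set of bin edges from the combined data and reuse it for every histogram. A minimal sketch using numpy.histogram_bin_edges, with stand-in samples (the names a, b, and shared_edges are assumptions for illustration):

```python
import numpy as np

# Stand-in samples (assumption), roughly matching the two distributions above
rng = np.random.default_rng(0)
a = rng.normal(loc=50, scale=30, size=100)
b = rng.normal(loc=75, scale=55, size=100)

# One set of 20 equal-width bins covering both samples
shared_edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=20)

# Both histograms now use exactly the same bins
counts_a, edges_a = np.histogram(a, bins=shared_edges)
counts_b, edges_b = np.histogram(b, bins=shared_edges)
```

Passing bins=shared_edges to each plt.hist call has the same effect when plotting. Note that these edges run from the overall minimum to the overall maximum, so they are consistent but not necessarily round.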

Personally, I prefer to stack them vertically as follows.

bins = 20
hist_range = (-50, 200)  # named hist_range so the built-in range() is not shadowed

fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(8, 8))
Y1, X1, _ = axes[0].hist(norm_rvs, bins=bins, alpha=0.5, ec='navy', range=hist_range)
Y2, X2, _ = axes[1].hist(norm_rvs2, bins=bins, alpha=0.5, ec='red', range=hist_range)
y_max = int(max(max(Y1), max(Y2))) + 1
# Use the same vertical limits and ticks on both plots so the heights are comparable
axes[0].set_ylim([0, y_max])
axes[1].set_ylim([0, y_max])
axes[0].set_yticks(np.arange(0, y_max, 2))
axes[1].set_yticks(np.arange(0, y_max, 2))
plt.show()

matplotlib_histgram_15_0.png
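
The stacked layout can also keep its axes aligned automatically: plt.subplots accepts sharex and sharey, which tie the axis limits of the subplots together. This is an alternative to setting the limits by hand as above, sketched here with stand-in data (the names a and b are assumptions for illustration):

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in samples (assumption), roughly matching the two distributions above
rng = np.random.default_rng(0)
a = rng.normal(loc=50, scale=30, size=100)
b = rng.normal(loc=75, scale=55, size=100)

bins = 20
hist_range = (-50, 200)

# sharex/sharey make both subplots use the same axis limits
fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(8, 8),
                         sharex=True, sharey=True)
axes[0].hist(a, bins=bins, alpha=0.5, ec='navy', range=hist_range)
axes[1].hist(b, bins=bins, alpha=0.5, ec='red', range=hist_range)
plt.show()
```

With shared axes, changing the limits or ticks on one subplot also applies to the other, so the set_ylim and set_yticks calls only need to be made once.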

That's all!
