[PYTHON] How to unify the bin width when displaying multiple histograms on top of each other (matplotlib)

Background

When I overlaid several histograms by calling plt.hist() in a for loop, each dataset got its own bin width, which made the distributions hard to compare. So I looked into how to draw all of the histograms with the same bin width.

Imports / dataset used

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine

wine = load_wine()
df_wine = pd.DataFrame(data=wine.data, columns=wine.feature_names)
df_wine['target'] = wine.target

This uses the scikit-learn wine dataset. The target column holds a label indicating the class of wine.

Method

If you pass a list to bins, an argument of plt.hist(), the histogram is drawn with the values in the list as bin edges. (With bins=[0, 1, 2, 3, 4], four bars are drawn, for the intervals 0 to 1, 1 to 2, 2 to 3, and 3 to 4.)
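To make the edge semantics concrete, here is a minimal sketch that counts values per interval with np.histogram, which the matplotlib documentation names as the function plt.hist() uses for binning. The sample values in data are made up for illustration and are not from the wine dataset.

```python
import numpy as np

# Made-up sample values (an assumption for illustration only)
data = [0.5, 1.2, 1.8, 2.5, 3.1, 3.9, 4.0]

# Five edges delimit four bins: [0, 1), [1, 2), [2, 3), [3, 4]
edges = [0, 1, 2, 3, 4]

# np.histogram returns the count per bin and the edges it used
counts, returned_edges = np.histogram(data, bins=edges)

print(counts)          # one count per bin
print(returned_edges)  # the same edges that were passed in
```

Every bin is half-open on the right except the last one, which also includes its right edge, so the value 4.0 is counted in the 3 to 4 bin.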

Using this, create the edges with np.linspace(minimum value, maximum value, number of edges) and pass the result to plt.hist() for every label, so that all histograms share the same bins. Note that N edges delimit N - 1 bins.
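The relation between the third argument of np.linspace and the number of bins is easy to get wrong by one, so a small sketch may help. The range 0.0 to 3.0 and the variable name n_bins are arbitrary choices for illustration.

```python
import numpy as np

# Arbitrary range and bin count, chosen only for illustration
x_min, x_max = 0.0, 3.0
n_bins = 6

# np.linspace returns evenly spaced points including both endpoints,
# so n_bins bins require n_bins + 1 edges
edges = np.linspace(x_min, x_max, n_bins + 1)

# Width of each bin is the difference between neighbouring edges
widths = np.diff(edges)

print(edges)
print(widths)
```

Because the edges are computed once from the overall minimum and maximum, every histogram that receives them has bars of identical width and position.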

↓ Without bin width adjustment

feature_name = 'hue'
target_names = df_wine['target'].unique()

for target in target_names:
    plt.hist(df_wine[df_wine.target == target][feature_name], alpha=0.6, label=target)

plt.title(feature_name)
plt.legend()

raw_hist.png

↓ With bin width adjustment

feature_name = 'hue'
target_names = df_wine['target'].unique()

# Split the range between the minimum and maximum values into n_bin equal-width bins,
# so that every target is drawn with the same bin edges.
n_bin = 15
x_max = df_wine[feature_name].max()
x_min = df_wine[feature_name].min()
# n_bin bins need n_bin + 1 edges
bins = np.linspace(x_min, x_max, n_bin + 1)

for target in target_names:
    plt.hist(df_wine[df_wine.target == target][feature_name], bins=bins, alpha=0.6, label=target)

plt.title(feature_name)
plt.legend()

bin_adjusted_hist.png
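The same idea can be wrapped in a small function and checked programmatically. The following is only a sketch: plot_overlaid_hists is a hypothetical helper name, and the three groups are synthetic random data standing in for the wine classes, so that the example runs without scikit-learn.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_overlaid_hists(groups, n_bins=15, alpha=0.6):
    """Hypothetical helper: overlay one histogram per group with shared bins.

    groups maps a label to a 1-D array of values.
    """
    # Compute the edges once from the pooled data of all groups
    pooled = np.concatenate([np.asarray(v, dtype=float) for v in groups.values()])
    edges = np.linspace(pooled.min(), pooled.max(), n_bins + 1)

    fig, ax = plt.subplots()
    used_edges = {}
    for label, values in groups.items():
        # ax.hist returns (counts, bin edges, patches); keep the edges for checking
        _, bin_edges, _ = ax.hist(values, bins=edges, alpha=alpha, label=str(label))
        used_edges[label] = bin_edges
    ax.legend()
    return fig, edges, used_edges


# Synthetic stand-in data (an assumption, not the wine dataset)
rng = np.random.default_rng(0)
groups = {
    0: rng.normal(1.0, 0.1, 50),
    1: rng.normal(1.1, 0.2, 70),
    2: rng.normal(0.7, 0.1, 40),
}

fig, edges, used_edges = plot_overlaid_hists(groups)
fig.savefig("overlaid_hist_sketch.png")
```

Comparing the entries of used_edges with edges confirms that every group was binned with identical edges, which is what makes the overlaid bars directly comparable.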

Referenced articles

Adjust the bin width quickly and neatly with the histogram of matplotlib and seaborn - Qiita
