TL;DR
`unstack` was useful for making a Series with a MultiIndex easier to read.
If you group a pandas DataFrame by multiple columns, the result has a MultiIndex.
I got a little stuck working with that, so I'm writing down what I did as a memo.
here,
I'm running on.
For example, suppose you have the following data:
import datetime
import random
import pandas as pd
item_list = ['A', 'A', 'A', 'B', 'C','C', 'D']
data_records = []
ts = datetime.datetime.now()
for _ in range(1000):
    ts += datetime.timedelta(seconds=random.randint(200, 3600))
    data_records.append({
        'ts': ts,
        'wday': ts.weekday(),
        'item': random.choice(item_list),
        'qty': random.randint(1, 5)
    })
df = pd.DataFrame(data_records)
This gives you a DataFrame df with the columns ts, wday, item, and qty.
Imagine something like the sales log of an e-commerce site.
Now suppose you want to see how many of each item sell in total on each day of the week. In practice you would normally also restrict the period using ts, but setting that aside, you would probably do the following.
df.groupby(['wday', 'item']).qty.sum()
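This returns a Series whose index is a MultiIndex of (wday, item). It's not bad, but it's hard to read. If you apply `unstack` here,
df.groupby(['wday', 'item']).qty.sum().unstack()
the result becomes a table with one row per wday and one column per item, which is much easier to read.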
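To make the reshaping concrete, here is a minimal sketch on a tiny hand-made DataFrame, so the numbers are easy to check by eye. The names `toy`, `totals`, and `table` and the data itself are made up for illustration and are not part of the log above. It also uses the `fill_value` argument of `unstack`, which is one possible way to handle (wday, item) pairs that never occur.

```python
import pandas as pd

# Hypothetical toy data (not the random log above): item "B" never sells on wday 1.
toy = pd.DataFrame({
    "wday": [0, 0, 0, 1, 1],
    "item": ["A", "B", "A", "A", "A"],
    "qty": [1, 2, 3, 4, 5],
})

# Grouping by two columns gives a Series indexed by a (wday, item) MultiIndex.
totals = toy.groupby(["wday", "item"])["qty"].sum()

# unstack() moves the innermost index level (item) into the columns.
# fill_value=0 fills (wday, item) pairs that never occurred, instead of NaN.
table = totals.unstack(fill_value=0)

print(totals)
print(table)
```

Without `fill_value`, the missing combinations show up as NaN and the affected columns become float, so whether to fill with 0 depends on whether "no sales" and "no data" mean the same thing for you.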
For more information, see the official pandas documentation.