[Python] 100 knocks on data science (structured data processing) 024 Explanation

A video commentary is also available on YouTube.

problem

P-024: For the receipt detail data frame (df_receipt), find the most recent sales date (sales_ymd) for each customer ID (customer_id) and display 10 rows.

answer

code


df_receipt.groupby('customer_id').sales_ymd.max().reset_index().head(10)

output

      customer_id  sales_ymd
0  CS001113000004   20190308
1  CS001114000005   20190731
2  CS001115000010   20190405
3  CS001205000004   20190625
4  CS001205000006   20190224
5  CS001211000025   20190322
6  CS001212000027   20170127
7  CS001212000031   20180906
8  CS001212000046   20170811
9  CS001212000070   20191018

Commentary

- 'groupby' is used on a pandas DataFrame or Series when you want to process rows that share the same value together, for example to check the total or average for each value.
- Here, 'groupby('customer_id')' collects the rows with the same customer ID so that a common operation (such as a total or average) can be applied to each group.
- '.sales_ymd.max()' returns the maximum value of 'sales_ymd' within each group, which is the newest sales date.
- '.reset_index()' moves the key used by 'groupby' (customer_id) out of the index and back into a regular column, and reassigns the index as serial numbers starting from 0.
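The three steps in the commentary can be traced on a small example. The sketch below is only illustrative: `df_toy` and its values are made up to stand in for df_receipt, and it assumes sales_ymd is stored as an integer in YYYYMMDD form, so that the largest number is also the latest date.

```python
import pandas as pd

# Hypothetical toy data standing in for df_receipt (values are made up).
df_toy = pd.DataFrame({
    "customer_id": ["A", "B", "A", "C", "B"],
    "sales_ymd": [20190101, 20180515, 20190308, 20170127, 20181231],
})

# Steps 1 and 2: group the rows by customer_id, then take the maximum
# sales_ymd in each group. The result is a Series indexed by customer_id.
latest = df_toy.groupby("customer_id").sales_ymd.max()

# Step 3: reset_index() turns customer_id back into a column and
# gives the rows a fresh 0, 1, 2, ... index.
result = latest.reset_index()
print(result)
```

Each customer appears once in `result`, paired with the largest sales_ymd found among that customer's rows.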

code (alternative using agg)


df_receipt.groupby('customer_id').agg({'sales_ymd':'max'}).reset_index().head(10)
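As a rough check that the two forms shown in this article are interchangeable, the sketch below runs both on a small made-up frame (`df_toy` is a hypothetical stand-in for df_receipt) and compares the results. The `agg` form takes a dictionary mapping column names to aggregation names, which is convenient when several columns need different aggregations.

```python
import pandas as pd

# Hypothetical toy data standing in for df_receipt (values are made up).
df_toy = pd.DataFrame({
    "customer_id": ["A", "B", "A", "C", "B"],
    "sales_ymd": [20190101, 20180515, 20190308, 20170127, 20181231],
})

# Form 1: select the column on the grouped object, then call max().
by_attr = df_toy.groupby("customer_id").sales_ymd.max().reset_index()

# Form 2: pass a {column: aggregation} mapping to agg().
by_agg = df_toy.groupby("customer_id").agg({"sales_ymd": "max"}).reset_index()

# DataFrame.equals checks that the labels and values match.
same = by_attr.equals(by_agg)
print(same)
```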
