[Python] 100 knocks on data science (structured data processing) 023 Explanation

A YouTube video commentary is also available.

problem

P-023: Sum the sales amount (amount) and sales quantity (quantity) for each store code (store_cd) for the receipt detail data frame (df_receipt).

answer

code


df_receipt.groupby('store_cd').agg({'amount':'sum', 'quantity':'sum'}).reset_index()

output

store_cd amount quantity
0 S12007 638761 2099
1 S12013 787513 2425
2 S12014 725167 2358
3 S12029 794741 2555
4 S12030 684402 2403
5 S13001 811936 2347
6 S13002 727821 2340
7 S13003 764294 2197
8 S13004 779373 2390
9 S13005 629876 2004
10 S13008 809288 2491
11 S13009 808870 2486
12 S13015 780873 2248
13 S13016 793773 2432
14 S13017 748221 2376
15 S13018 790535 2562
16 S13019 827833 2541
17 S13020 796383 2383
18 S13031 705968 2336
19 S13032 790501 2491
20 S13035 715869 2219
21 S13037 693087 2344
22 S13038 708884 2337
23 S13039 611888 1981
24 S13041 728266 2233
25 S13043 587895 1881
26 S13044 520764 1729
27 S13051 107452 354
28 S13052 100314 250
29 S14006 712839 2284
30 S14010 790361 2290
31 S14011 805724 2434
32 S14012 720600 2412
33 S14021 699511 2231
34 S14022 651328 2047
35 S14023 727630 2258
36 S14024 736323 2417
37 S14025 755581 2394
38 S14026 824537 2503
39 S14027 714550 2303
40 S14028 786145 2458
41 S14033 725318 2282
42 S14034 653681 2024
43 S14036 203694 635
44 S14040 701858 2233
45 S14042 534689 1935
46 S14045 458484 1398
47 S14046 412646 1354
48 S14047 338329 1041
49 S14048 234276 769
50 S14049 230808 788
51 S14050 167090 580

Commentary

'groupby' is a Pandas DataFrame / Series method for processing rows that share the same value as a group.
- Use it when you want to check the total or average of the rows that share the same value.
- 'groupby' collects rows that have the same value or character string in a column so that a common operation (total, average, etc.) can be applied to each group.
- 'agg' is short for aggregation. It computes a value for each group and returns the results as a table. Specify 'sum' for the total, 'mean' for the average, 'max' for the maximum, and 'min' for the minimum.
- '.reset_index()' moves the group keys that 'groupby' placed in the index back into an ordinary column and reassigns the index to serial numbers starting from 0.
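To see each step separately, here is a minimal sketch on a tiny data frame. The name `df_toy` and all of its values are invented for illustration and are not taken from df_receipt. The last line shows a shorter form that should give the same result when every column uses the same aggregation.

```python
import pandas as pd

# Hypothetical miniature stand-in for df_receipt (values invented for illustration).
df_toy = pd.DataFrame({
    "store_cd": ["S001", "S002", "S001", "S002", "S003"],
    "amount": [100, 200, 300, 400, 500],
    "quantity": [1, 2, 3, 4, 5],
})

# Group rows by store code, then sum each listed column within its group.
# At this point store_cd is the index, not a regular column.
grouped = df_toy.groupby("store_cd").agg({"amount": "sum", "quantity": "sum"})

# Move store_cd back into a column and renumber the rows from 0.
result = grouped.reset_index()

# Shorter form for the case where every column gets the same aggregation.
result_alt = df_toy.groupby("store_cd")[["amount", "quantity"]].sum().reset_index()

print(result)
```

The dictionary passed to 'agg' is mainly useful when columns need different operations, for example {'amount': 'sum', 'quantity': 'mean'}.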
