# "Data Science 100 Knock (Structured Data Processing)" Python-004 Explanation

A YouTube video commentary is also available.

## Problem

P-004: From the receipt details data frame (df_receipt), select the columns sales date (sales_ymd), customer ID (customer_id), product code (product_cd), and sales amount (amount), in that order, and extract the rows that satisfy the following condition.

- Customer ID (customer_id) is "CS018205000001"

#### `code`

``````
df_receipt[['sales_ymd', 'customer_id', 'product_cd', 'amount']] \
.query('customer_id == "CS018205000001"')
``````

#### `output`

``````
       sales_ymd     customer_id  product_cd  amount
36      20180911  CS018205000001  P071401012    2200
9843    20180414  CS018205000001  P060104007     600
21110   20170614  CS018205000001  P050206001     990
27673   20170614  CS018205000001  P060702015     108
27840   20190216  CS018205000001  P071005024     102
28757   20180414  CS018205000001  P071101002     278
39256   20190226  CS018205000001  P070902035     168
58121   20190924  CS018205000001  P060805001     495
68117   20190226  CS018205000001  P071401020    2200
72254   20180911  CS018205000001  P071401005    1100
88508   20190216  CS018205000001  P040101002     218
91525   20190924  CS018205000001  P091503001     280
``````

## Commentary

- This is a way to select specific columns of a Pandas DataFrame and, at the same time, filter the rows by a condition.
- Use it when you want to narrow the data down to a few columns and then look only at the rows you care about.
- The general pattern is `df[['column_A', 'column_B', 'column_C']].query('column_A == "value_A"')`. From the selected columns (column_A, column_B, column_C), it returns the rows whose column_A equals value_A.
- Note that `==` tests whether two values are equal, while a single `=` is an assignment.
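To try the pattern without the full dataset, here is a minimal sketch on a tiny stand-in for `df_receipt`. The three rows and the extra `store_cd` column are assumptions made for illustration, not the real data. It also shows a boolean-mask form with `.loc`, which should give the same result as `query`.

```python
import pandas as pd

# Hypothetical miniature stand-in for df_receipt (rows are illustrative only).
df_receipt = pd.DataFrame({
    "sales_ymd": [20180911, 20180414, 20170614],
    "store_cd": ["S14006", "S13008", "S14028"],
    "customer_id": ["CS018205000001", "CS006214000001", "CS018205000001"],
    "product_cd": ["P071401012", "P070305012", "P050206001"],
    "amount": [2200, 158, 990],
})

columns = ["sales_ymd", "customer_id", "product_cd", "amount"]

# Select the columns in the requested order, then filter rows with query().
by_query = df_receipt[columns].query('customer_id == "CS018205000001"')

# Same result using a boolean mask: "==" compares every value in the column.
by_mask = df_receipt.loc[df_receipt["customer_id"] == "CS018205000001", columns]

print(by_query)
```

Because the columns are selected before `query` runs, the filter column (`customer_id` here) has to be among the selected columns; the `.loc` form does not have that restriction.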

• For the other Python comparison operators, please refer to here.
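As a rough illustration of the difference between `=` and `==`, and of a few other comparison operators inside `query`, the following sketch uses a made-up three-row frame. The names `df`, `target`, and the values are assumptions for this example only.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "customer_id": ["A", "B", "A"],
    "amount": [100, 250, 300],
})

# "=" assigns a value to a name.
target = "A"

# "==" compares; on a Series it returns one True/False per row.
is_match = df["customer_id"] == target
print(is_match.tolist())

# Other comparison operators work the same way inside query().
large = df.query("amount >= 250")                # greater than or equal
not_a = df.query('customer_id != "A"')           # not equal
both = df.query("customer_id == @target and amount > 100")  # @ refers to a Python variable

print(large, not_a, both, sep="\n\n")
```

The `@target` syntax lets `query` read a local Python variable, which avoids building the condition string by hand.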