# "Data Science 100 Knock (Structured Data Processing)" Python-006 Explanation

A YouTube video commentary is also available.

## Problem

P-006: From the receipt detail data frame "df_receipt", select the columns sales date (sales_ymd), customer ID (customer_id), product code (product_cd), sales quantity (quantity), and sales amount (amount), in that order, and extract the rows that meet the following conditions:

- Customer ID (customer_id) is "CS018205000001"
- Sales amount (amount) is 1,000 or more, or sales quantity (quantity) is 5 or more

#### `code`

``````
# Select the columns in the requested order, then filter the rows with query()
df_receipt[['sales_ymd', 'customer_id', 'product_cd', 'quantity', 'amount']] \
    .query('customer_id == "CS018205000001" & (amount >= 1000 | quantity >= 5)')
``````

#### `output`

``````
       sales_ymd     customer_id  product_cd  quantity  amount
36      20180911  CS018205000001  P071401012         1    2200
9843    20180414  CS018205000001  P060104007         6     600
21110   20170614  CS018205000001  P050206001         5     990
68117   20190226  CS018205000001  P071401020         1    2200
72254   20180911  CS018205000001  P071401005         1    1100
``````

## Commentary

- With a pandas DataFrame, you can select specific columns and then extract only the rows that satisfy multiple conditions.
- Use this when you want to narrow down the columns and check only the rows that meet a combination of conditions.
- Inside `query`, the OR condition is written with `|` (vertical bar) and the AND condition with `&`.
- `df[['column_A', 'column_B', 'column_C']].query('column_A == "A" & (column_B >= 1000 | column_C >= 5)')` returns, from the selected columns (column_A, column_B, column_C), the rows where column_A equals "A" and either column_B is 1000 or more or column_C is 5 or more.
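As a minimal sketch of the pattern described above, the following applies the same column selection and `query` condition to a small made-up table. The name `df_sample`, its rows, and the extra `store_cd` column are assumptions for illustration only; they are not taken from the real `df_receipt`.

```python
import pandas as pd

# Hypothetical sample rows standing in for df_receipt (not real data).
df_sample = pd.DataFrame({
    "sales_ymd": [20180911, 20180414, 20170614, 20181103, 20190101],
    "store_cd": ["S1", "S1", "S2", "S2", "S3"],
    "customer_id": ["CS018205000001", "CS018205000001", "CS018205000001",
                    "CS018205000001", "CS999999999999"],
    "product_cd": ["P001", "P002", "P003", "P004", "P005"],
    "quantity": [1, 6, 5, 1, 10],
    "amount": [2200, 600, 990, 100, 5000],
})

# Columns to keep, in the order they should appear in the result.
columns = ["sales_ymd", "customer_id", "product_cd", "quantity", "amount"]

# Keep rows for one customer where amount >= 1000 OR quantity >= 5.
result = df_sample[columns].query(
    'customer_id == "CS018205000001" & (amount >= 1000 | quantity >= 5)'
)
print(result)
```

Row 3 is dropped because it meets neither the amount nor the quantity condition, and row 4 is dropped because it belongs to a different customer.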

**Note:** Replacing `|` with `or` gives the same result: `df_receipt[['sales_ymd', 'customer_id', 'product_cd', 'quantity', 'amount']].query('customer_id == "CS018205000001" & (amount >= 1000 or quantity >= 5)')`
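To check the equivalence of `|` and `or` on a small scale, the sketch below runs both spellings inside `query` and compares them with an ordinary boolean mask. The table `df_sample` and its customer IDs are hypothetical. Outside `query`, boolean masks have to be combined with `&` and `|`, each comparison in parentheses, because Python's `and`/`or` raise an error on a Series.

```python
import pandas as pd

# Hypothetical sample data; the customer IDs "A" and "B" are placeholders.
df_sample = pd.DataFrame({
    "customer_id": ["A", "A", "A", "B"],
    "quantity": [1, 6, 1, 9],
    "amount": [2200, 600, 100, 3000],
})

# Symbol operators inside query().
with_pipe = df_sample.query('customer_id == "A" & (amount >= 1000 | quantity >= 5)')

# Word operators inside query().
with_or = df_sample.query('customer_id == "A" and (amount >= 1000 or quantity >= 5)')

# Plain boolean indexing: only & and | work here, and each comparison
# needs its own parentheses.
mask = (df_sample["customer_id"] == "A") & (
    (df_sample["amount"] >= 1000) | (df_sample["quantity"] >= 5)
)
with_mask = df_sample[mask]

print(with_pipe.equals(with_or), with_pipe.equals(with_mask))
```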