[Python] How to filter by group after groupby (get all rows of every group that satisfies a condition)

I want to filter grouped data by a condition that is evaluated on each group.

This post explains how, in Python, to get all the rows of every group that satisfies a condition after a groupby.

For example, given the following data, the goal is to get all the rows for each person whose highest score is 80 or more.

import pandas as pd

df = pd.DataFrame({"name":["Yamada","Yamada","Yamada","Suzuki","Suzuki","Hayashi"],
                   "score":[60,70,80,60,70,80]})
print(df)

#       name  score
# 0   Yamada     60
# 1   Yamada     70
# 2   Yamada     80
# 3   Suzuki     60
# 4   Suzuki     70
# 5  Hayashi     80

(Corrected on 19/12/05) In such a case, you can write it in one line by using `groupby.filter`.

new_df = df.groupby('name').filter(lambda group: group['score'].max() >= 80)
print(new_df)

#       name  score
# 0   Yamada     60
# 1   Yamada     70
# 2   Yamada     80
# 5  Hayashi     80

The argument to `filter()` is a lambda expression for the condition. It receives each group as a DataFrame and returns True for the groups to keep.
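To make the role of that lambda more concrete, here is a minimal sketch that evaluates the same condition on each group by hand. The names `has_high_score`, `verdicts`, and `filtered` are introduced only for this illustration and are not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Yamada", "Yamada", "Yamada", "Suzuki", "Suzuki", "Hayashi"],
                   "score": [60, 70, 80, 60, 70, 80]})

# Same condition as the lambda above, written as a named function
# (has_high_score is a hypothetical name used only in this sketch).
def has_high_score(group):
    # group is the sub-DataFrame holding the rows of one name
    return group["score"].max() >= 80

# Evaluate the condition once per group to see which groups pass.
verdicts = {name: bool(has_high_score(group)) for name, group in df.groupby("name")}
print(verdicts)

# filter() keeps the rows of exactly those groups whose verdict is True.
filtered = df.groupby("name").filter(has_high_score)
print(filtered)
```

Each group yields a single True or False, and `filter` returns the original rows of the groups that passed, with their original index.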

By the way, before I was taught this on Qiita, I used to do the filtering as follows: compute the maximum of each group with groupby, take the keys of the groups that meet the condition, and then left-join the original data frame onto those keys. Specifically, the code is as follows.

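# Highest score per name
group_df = df.groupby('name').max().reset_index()
# Names whose highest score is 80 or more
key = group_df[group_df['score'] >= 80]['name']
# Left-join the original rows back onto those names
new_df = pd.merge(key, df, on='name', how='left')
print(new_df)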

#       name  score
# 0  Hayashi     80
# 1   Yamada     60
# 2   Yamada     70
# 3   Yamada     80

I was impressed that this whole flow, retrieving the keys that satisfy the condition and then using a left outer join to restore the score information dropped by the groupby operation, can be written in one line.
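Note that the merge-based result above comes back reordered and re-indexed (Hayashi first, index 0 to 3). If the original order and index matter, the key-based approach can also be sketched with `isin` in place of the merge. This is a variation for illustration, not the method from the original post, and the names `max_score` and `keys` are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Yamada", "Yamada", "Yamada", "Suzuki", "Suzuki", "Hayashi"],
                   "score": [60, 70, 80, 60, 70, 80]})

# Highest score per name, as a Series indexed by name
max_score = df.groupby("name")["score"].max()

# Names whose highest score is 80 or more
keys = max_score[max_score >= 80].index

# Keep the original rows whose name is one of those keys;
# boolean indexing preserves the original row order and index.
new_df = df[df["name"].isin(keys)]
print(new_df)
```

This should select the same rows as the `groupby.filter` one-liner, in the same order.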
