[PYTHON] Manipulating strings with pandas group by

Overview

Examples of getting the average, minimum, and maximum values with pandas are easy to find, but I often create groups and then process the strings inside each group, so I wrote up these notes for my own reference instead. ~~I suspect this topic has already been rehashed many times...~~

Requirements

I used Jupyter Notebook to verify the code.

Processing content

The data used is the adverse event data from JADER (the Japanese Adverse Drug Event Report database).


import pandas as pd
import numpy as np

# Read every column as a string; the file is encoded in Shift JIS (JIS X 0213)
reacs = pd.read_csv('reac.csv', dtype='str', encoding='shift-jisx0213')

First, group by **identification number** so that each case forms its own group:

groupCaseNo=reacs.groupby('Identification number')

Because the data is grouped by identification number, you can get the group keys through the `groups` attribute, as shown below.

groupCaseNo.groups.keys()
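To see what the keys look like without the JADER file at hand, here is a minimal sketch on a made-up DataFrame. The column names mirror the ones used in this post, but `sample`, `sample_groups`, and all the values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for reac.csv; the values are invented for illustration.
sample = pd.DataFrame({
    'Identification number': ['A-001', 'A-001', 'B-002'],
    'Harmful event serial number': ['1', '2', '1'],
    'Adverse event': ['Rash', 'Fever', 'Nausea'],
})

sample_groups = sample.groupby('Identification number')

# groups maps each group key to the row labels that belong to it
group_keys = list(sample_groups.groups.keys())
print(group_keys)
```

Each distinct identification number appears once among the keys, no matter how many rows it has.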

You can process each key as follows. Passing a group key to `get_group` returns the rows that belong to that group.

for case in groupCaseNo.groups.keys():
    print(groupCaseNo.get_group(case))
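As an alternative to looking up each key with `get_group`, a GroupBy object can also be iterated directly, which yields the key together with its rows. The following is a sketch on invented data (the `sample` frame and the `rows_per_case` dictionary are assumptions for illustration, not part of the JADER workflow).

```python
import pandas as pd

# Hypothetical stand-in for reac.csv; the values are invented for illustration.
sample = pd.DataFrame({
    'Identification number': ['A-001', 'A-001', 'B-002'],
    'Harmful event serial number': ['1', '2', '1'],
    'Adverse event': ['Rash', 'Fever', 'Nausea'],
})

sample_groups = sample.groupby('Identification number')

# Iterating a GroupBy object yields (key, sub-DataFrame) pairs
rows_per_case = {}
for case, frame in sample_groups:
    rows_per_case[case] = len(frame)

print(rows_per_case)
```

This avoids a separate lookup per key, which may be convenient when every group has to be visited anyway.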

You can combine strings by passing a function to `apply`, as shown below. An anonymous function written with lambda also works, but I think it is better to define a separate function when doing complicated things.

def getRecordAe(data):
    # Column names that contain spaces need bracket access, not attribute access
    return data['Harmful event serial number'] + ':' + data['Adverse event']

groupCaseNo.apply(getRecordAe)
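If the goal is one combined string per case, one possible approach (a sketch, not necessarily what this post intends) is to build the per-row string first and then join the rows of each group with `agg`. The `sample` frame, the `record` column, and the `'; '` separator are assumptions made for this example.

```python
import pandas as pd

# Hypothetical stand-in for reac.csv; the values are invented for illustration.
sample = pd.DataFrame({
    'Identification number': ['A-001', 'A-001', 'B-002'],
    'Harmful event serial number': ['1', '2', '1'],
    'Adverse event': ['Rash', 'Fever', 'Nausea'],
})

# Build the "serial:event" string for every row
sample['record'] = sample['Harmful event serial number'] + ':' + sample['Adverse event']

# Join the row strings within each case into a single string
joined = sample.groupby('Identification number')['record'].agg('; '.join)

print(joined)
```

The result is a Series indexed by identification number, so a single case can be looked up with `joined['A-001']`.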
