[PYTHON] I want to convert vertically held data (long type) to horizontally held data (wide type)

A friend of mine who works for a medical company asked me if I wanted to convert a large amount of long data into wide data. In that industry, data such as patients are often represented in a wide format. So, I made a conversion tool in python, so I wrote it as an article.


Below, the conversion image

Before conversion

ID,data,item,
0,100,BPS,
0,200,LDL,
1,20,YBY,
2,XXX,YYY,
2,AAA,BBB,

After conversion

ID   YBY  BBB  BPS     LDL  YYY                              
0     N/A  N/A  100    200  N/A
1      20  N/A  N/A     N/A  N/A
2     N/A  AAA  N/A     N/A  XXX

So I created a tool using pandas. The created tool is as follows.

import pandas as pd
import csv

csv_file = open("sample.csv", "r",
                encoding="ms932", errors="", newline="")
f = csv.reader(csv_file, delimiter=",", doublequote=True,
               lineterminator="\r\n", quotechar='"', skipinitialspace=True)
columns = next(f)
data = []

for row in f:
    data.append(row)

df = pd.DataFrame(data, columns=columns)
df_t = df.pivot_table(values=[columns[1]], index=[columns[0]], columns=[
                      columns[2]], aggfunc='sum', fill_value='N/A')
df_t.to_excel('out.xlsx', index=True)

Read sample.csv, convert it to wide type data, and output it to Excel. As a premise, there are three columns in csv. Create a row for each ID, and the more items there are, the more columns will be added. Then, data is set for each value. If there is no data, N / A will be set.

Output result スクリーンショット 2020-07-05 13.26.34.png

If you want to convert CSV from long type to wide type, please refer to this implementation. The source is also posted on the following Github. https://github.com/kurihiro0119/transform_wide_long

Recommended Posts

I want to convert vertically held data (long type) to horizontally held data (wide type)
How to convert horizontally held data to vertically held data with pandas
numpy: I want to convert a single type ndarray to a structured array
Anyway, I want to check JSON data easily
I want to get League of Legends data ③
I want to create a plug-in type implementation
I want to get League of Legends data ①
I want to convert an image to WebP with lollipop
I convert AWS JSON data to CSV like this
I want to give a group_id to a pandas data frame
I want to say that there is data preprocessing ~
I want to be able to analyze data with Python (Part 3)
I want to be able to analyze data with Python (Part 1)
I want to be able to analyze data with Python (Part 2)
I want to convert an ISO-8601 character string to Japan time
I want to make the Dictionary type in the List unique
I want to solve Sudoku (Sudoku)
I want to acquire and list Japanese stock data without scraping
For the time being, I want to convert files with ffmpeg !!
I want to scrape images to learn
I want to do ○○ with Pandas
I want to copy yolo annotations
[Note] I want to completely preprocess the data of the Titanic issue-Age version-
I want to convert a table converted to PDF in Python back to CSV
I want to batch convert the result of "string" .split () in Python
I analyzed Airbnb data for those who want to stay in Amsterdam