[Python] Notes on pandas DataFrame operations

Introduction

As a beginner, I recently had the chance to do some data analysis, so I am summarizing the pandas DataFrame syntax I picked up along the way.

Premise

product.csv

id  name    price  category    isPopular
1   eraser  100    stationery  1
2   pencil  200    stationery  0
3   socks   400    clothes     1
4   pants   1000   clothes     0
5   apple   100    food        0

analyze.py


import pandas as pd
df = pd.read_csv('product.csv')  # assumption: product.csv is comma-separated

Extract the distinct values of a column

df['category'].value_counts().index

Execution result

Index(['stationery', 'clothes', 'food'], dtype='object')
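As a rough sketch of an alternative, `unique()` returns the distinct values directly, while `value_counts()` also shows how many rows hold each value. The DataFrame below is an in-memory stand-in for product.csv, built by hand for illustration only.

```python
import pandas as pd

# Hand-built stand-in for product.csv (assumed contents, for illustration)
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["eraser", "pencil", "socks", "pants", "apple"],
    "price": [100, 200, 400, 1000, 100],
    "category": ["stationery", "stationery", "clothes", "clothes", "food"],
    "isPopular": [1, 0, 1, 0, 0],
})

# unique() keeps the order in which each value first appears
categories = list(df["category"].unique())

# value_counts() maps each distinct value to its number of rows
counts = df["category"].value_counts()

print(categories)
print(counts)
```

If only the list of values is needed, `unique()` avoids the extra counting step.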

Update or add DataFrame values by condition

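# Overwrite price only in the rows where name is 'socks'
df.loc[df['name'] == 'socks', 'price'] = 500
# Assigning to a column that does not exist yet creates it (unmatched rows get NaN)
df.loc[df['category'] == 'stationery', 'category_id'] = 0
df.loc[df['category'] == 'clothes', 'category_id'] = 1
df.loc[df['category'] == 'food', 'category_id'] = 2
df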

Execution result

id  name    price  category    isPopular  category_id
1   eraser  100    stationery  1          0.0
2   pencil  200    stationery  0          0.0
3   socks   500    clothes     1          1.0
4   pants   1000   clothes     0          1.0
5   apple   100    food        0          2.0
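The category_id values come out as floats (0.0, 1.0, 2.0) because the column is first created with NaN in the rows that have not been matched yet. A possible alternative, sketched here under the assumption that every category has an entry in the lookup table, is to build the column in one step with `map()`. The dictionary name `category_ids` is hypothetical and not part of the original code.

```python
import pandas as pd

# Hand-built stand-in for product.csv (assumed contents, for illustration)
df = pd.DataFrame({
    "name": ["eraser", "pencil", "socks", "pants", "apple"],
    "price": [100, 200, 400, 1000, 100],
    "category": ["stationery", "stationery", "clothes", "clothes", "food"],
})

# Conditional update: the boolean mask selects rows, 'price' selects the column
df.loc[df["name"] == "socks", "price"] = 500

# Hypothetical lookup table from category name to numeric id
category_ids = {"stationery": 0, "clothes": 1, "food": 2}

# map() replaces each category by its id; a category missing from the
# dictionary would become NaN
df["category_id"] = df["category"].map(category_ids)

print(df)
```

With a single lookup table, adding a new category means adding one dictionary entry instead of another `df.loc` line.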

Convert to a one-hot representation

# Extract only the isPopular and category_id columns (the encoder failed for me on non-integer values)
df_X = df.drop(['id','name','price','category'], axis=1)
# Keep the id column so it can be joined back to the encoded result
df_id = df[['id']]

from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder()
enc.fit(df_X)
# transform() returns a sparse matrix; toarray() converts it to a dense array
onehot_array = enc.transform(df_X).toarray()
onehot_df = pd.DataFrame(onehot_array)
df = pd.concat([df_id, onehot_df], axis=1)
df

Execution result

id 0 1 2 3 4
1 0.0 1.0 1.0 0.0 0.0
2 1.0 0.0 1.0 0.0 0.0
3 0.0 1.0 0.0 1.0 0.0
4 1.0 0.0 0.0 1.0 0.0
5 1.0 0.0 0.0 0.0 1.0
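In the result, columns 0 and 1 correspond to isPopular being 0 and 1, and columns 2 to 4 correspond to category_id 0, 1 and 2. As a sketch of a different technique (not the OneHotEncoder approach used above), `pd.get_dummies` gives the same 0/1 layout with labeled columns. The DataFrame is a hand-built stand-in for the data at this point.

```python
import pandas as pd

# Hand-built stand-in for the DataFrame after category_id was added
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "isPopular": [1, 0, 1, 0, 0],
    "category_id": [0, 0, 1, 1, 2],
})

# columns= forces both integer columns to be treated as categorical;
# dtype=float matches the 0.0 / 1.0 values of the encoder output
onehot_df = pd.get_dummies(
    df[["isPopular", "category_id"]],
    columns=["isPopular", "category_id"],
    dtype=float,
)

# Put the id column back in front of the encoded columns
result = pd.concat([df[["id"]], onehot_df], axis=1)

print(result)
```

The generated column names (for example isPopular_1 or category_id_2) make it easier to see which original value each column stands for.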
