[PYTHON] Points to note when reading Excel-exported CSV files with pandas

When a file is going to be read by Python, I would like it to be encoded in UTF-8. For various reasons on the data output side, however, the receiving side often has to handle the conversion when reading.

CSV files exported from Excel on Japanese Windows are encoded in Shift JIS. So, with pandas, you might write:

import pandas as pd
dataset1 = pd.read_csv("hogehoge.csv",encoding="shift_jis")

You may think this is all you need, but be careful: some files cannot be read properly this way.

test.csv


Yamada,1000
Sato,2000
Yamamoto,3000

This file can be read without problems.

test2.csv


1,Yamada,1000
2,Takahashi,2000
3,黒﨑,3000

With this file, however, I invariably get the following error:

UnicodeDecodeError: 'shift_jis' codec can't decode byte 0xfb in position 0: illegal multibyte sequence

This happens because test2.csv contains Windows extension characters such as:

・ Hashigodaka "髙"
・ Tachisaki "﨑"

To read such characters, the encoding must be cp932.

encoding='cp932'
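
The difference between the two codecs can be checked without pandas. The following is a minimal sketch, assuming Python 3's built-in shift_jis and cp932 codecs. The `results` dictionary is a name introduced here only for illustration.

```python
# Check how Python's built-in codecs treat the two extension characters.
results = {}
for ch in ["髙", "﨑"]:
    data = ch.encode("cp932")  # cp932 has a code for both characters
    try:
        data.decode("shift_jis")  # the stricter codec rejects these bytes
        shift_jis_ok = True
    except UnicodeDecodeError:
        shift_jis_ok = False
    results[ch] = (data.decode("cp932"), shift_jis_ok)

for ch, (decoded, ok) in results.items():
    print(ch, "cp932 round trip:", decoded == ch, "| shift_jis decodes:", ok)
```

Both characters should survive the cp932 round trip, while decoding the same bytes as shift_jis raises UnicodeDecodeError, which matches the error shown above.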

Given this, do not assume that shift_jis is fine just because the file came from Windows. If you read the file with cp932 from the start, you can avoid unnecessary trouble.

import pandas as pd
dataset1 = pd.read_csv("hogehoge.csv",encoding="cp932")
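
To try this without an actual Excel export, the sketch below builds cp932 bytes in memory and reads them back with pandas. The sample rows and the column names `id`, `name`, and `amount` are hypothetical, chosen only for this illustration. It assumes pandas is installed.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for an Excel-exported CSV.
rows = "1,Yamada,1000\n2,髙橋,2000\n3,黒﨑,3000\n"
raw = rows.encode("cp932")  # bytes as they would be stored in a cp932 file

# The sample has no header row, so header=None and names= supply the columns.
df = pd.read_csv(
    io.BytesIO(raw),
    encoding="cp932",
    header=None,
    names=["id", "name", "amount"],
)
print(df)
```

With a real file, passing the file path in place of `io.BytesIO(raw)` should behave the same way.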
