[PYTHON] What to do when UnicodeDecodeError occurs during read_csv in pandas (pd.read_table())

Reading a CSV file with pandas is very convenient because a single call to read_csv is all you need.

import pandas as pd
pd.read_csv("file/to/path")

Normally this works without any problem, but if the CSV contains bytes that cannot be decoded as UTF-8 (the default encoding), the following error is thrown.

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 0: invalid start byte

pandas is complaining that it cannot decode the file.
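To see where the message comes from, the error can be reproduced without pandas. The sketch below assumes the file really holds Shift-JIS text; the sample string "テスト" is only an illustration, not data from the original file. Its first byte is 0x83, which UTF-8 allows only as a continuation byte, so a UTF-8 decoder rejects it at position 0.

```python
# Hypothetical sample: Japanese text encoded the way a Shift-JIS CSV would store it.
raw = "テスト".encode("shift_jis")
first_byte = raw[0]  # lead byte of the first character

reason = None
position = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # Same kind of failure read_csv reports with its default UTF-8 encoding.
    reason = exc.reason
    position = exc.start

print(hex(first_byte), reason, position)
```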

Since CSV files created in Excel are encoded as "shift-jis", I first tried specifying it with the `encoding` argument of read_csv:

import pandas as pd
pd.read_csv("file/to/path", encoding="shift-jis")

As you might expect, it still fails with an error.

UnicodeDecodeError: 'shift_jis' codec can't decode byte 0x87 in position 0: illegal multibyte sequence
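One likely cause, though the original file is not available to confirm it, is that Excel on Japanese Windows writes CSV in Microsoft's extended variant of Shift-JIS, which Python calls cp932. That variant includes characters such as ①, whose first byte is 0x87, and Python's stricter shift_jis codec has no mapping for them. The bytes below are a made-up example for illustration.

```python
# Made-up example: the two bytes cp932 uses for the circled digit one.
raw = b"\x87\x40"

try:
    raw.decode("shift_jis")
    strict_ok = True
except UnicodeDecodeError:
    # The strict codec has no mapping for a character starting with 0x87.
    strict_ok = False

decoded = raw.decode("cp932")  # Microsoft's superset handles it
print(strict_ok, decoded)
```

If that is the situation, passing encoding="cp932" to pd.read_csv may be enough, with no bytes discarded.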

As a workaround, you can open the file with codecs.open, passing `ignore` as the error handler so that undecodable bytes are skipped, and then hand the opened file to `pd.read_table`.

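import codecs
import pandas as pd
# "ignore" silently drops any bytes that cannot be decoded as Shift-JIS
with codecs.open("file/to/path", "r", "Shift-JIS", "ignore") as file:
    df = pd.read_table(file, delimiter=",")
    print(df)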

The StreamReaderWriter object can be passed to pandas as it is, without calling file.read().
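Keep in mind that "ignore" does not repair anything; it discards whatever cannot be decoded. The sketch below applies the same error handler to a made-up byte string with bytes.decode instead of codecs.open, to show that the offending character is silently lost. "replace" is another standard handler that leaves a visible U+FFFD marker instead.

```python
# Made-up row: "A", a cp932-only character (①), then "B".
raw = "A①B".encode("cp932")

# Same handler name that codecs.open receives as its fourth argument.
ignored = raw.decode("shift_jis", errors="ignore")
# "replace" marks the damaged spot with U+FFFD rather than hiding it.
replaced = raw.decode("shift_jis", errors="replace")

print(repr(ignored), repr(replaced))
```

Because characters can disappear this way, it is worth checking the resulting DataFrame for unexpected values. Newer pandas releases (1.3 and later, to my knowledge) also accept an encoding_errors argument on read_csv that takes the same handler names.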

I got stuck on this, so I wrote it down.
