[Python] Read Japanese csv with pandas without garbled characters (and extract columns written in Japanese)

I am studying Python 2.7.6 using PyCharm.

This time, when I read a csv file written in Japanese for analysis, the output was garbled. After some investigation, I solved it with the following method.

Why the csv file is garbled

I read the file in the same way as in my previous entry, and the output came out garbled.

import pandas as pd
import os

#Specify the path to the working directory where the data is stored
os.chdir("/File path to directory")
#read csv
df= pd.read_csv("japanese.csv")
print df

However, because no character encoding was specified when reading the csv, the file was read, but its contents came out garbled.
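To make the cause concrete, here is a minimal sketch that needs no pandas at all. It encodes a short Japanese string as Shift-JIS bytes and then decodes those bytes twice, once with the wrong codec and once with the right one. The sample text is made up, and latin-1 is only a stand-in for "some wrong encoding"; the exact symptom you see with a real file depends on your Python and pandas versions.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text, not taken from the original csv.
text = u"名前"

# These are the bytes a Shift-JIS csv file would contain for this text.
raw = text.encode("shift_jis")

# Decoding with the wrong codec does not fail here, but the result is garbled.
wrong = raw.decode("latin-1")

# Decoding with the codec the bytes were written in restores the text.
right = raw.decode("shift_jis")

print(right == text)
print(wrong == text)
```

The bytes themselves never change; only the codec used to interpret them decides whether the Japanese text comes back.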

How to read Japanese csv file without garbled characters

So I read the csv file with the encoding set to SHIFT-JIS, as shown below, and the csv file was displayed in Japanese!

import pandas as pd
import os

#Specify the path to the working directory where the data is stored
os.chdir("/File path to directory")
#Reading csv with specified character code
df= pd.read_csv("japanese.csv",encoding="SHIFT-JIS")
print df
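If you do not have a Shift-JIS file at hand, the following sketch creates one in a temporary directory and reads it back, so the effect of the encoding argument can be checked end to end. The column names and values are invented for illustration, and print() is written so that it runs on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

import pandas as pd

# Hypothetical sample data; the column names are made up for this sketch.
csv_text = u"名前,年齢\n太郎,30\n花子,25\n"

# Write the sample as a Shift-JIS encoded file in a temporary directory.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "japanese.csv")
with io.open(csv_path, "w", encoding="shift_jis") as f:
    f.write(csv_text)

# Read it back, telling pandas which encoding the file was written in.
df = pd.read_csv(csv_path, encoding="shift_jis")
print(df)
```

Python treats "SHIFT-JIS" and "shift_jis" as aliases of the same codec, so either spelling should work in the encoding argument.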

Now that the Japanese csv file can be read this way, it's time to process the data! However, after changing the character encoding, I could not retrieve a column specified by its Japanese name, and I struggled again ...

Extract only columns written in Japanese

To fetch only the columns you want, specify the column explicitly: writing loc[:, "desired column name"] gets every row under "desired column name". A detailed explanation is available here. As a result, by writing the following, I was able to get the column named "the name of the column you want" without trouble!

import pandas as pd
import os

#Specify the path to the working directory where the data is stored
os.chdir("/File path to directory")
#Reading csv with specified character code
df= pd.read_csv("japanese.csv",encoding="SHIFT-JIS")
column = df.loc[:,[u'The name of the column you want']]
print column 
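As a small runnable illustration of the selection above, the sketch below builds a Shift-JIS csv in memory and selects a column by its Japanese name in two ways. The data and the column name are hypothetical. Passing a list of labels to loc (as in the code above) should give back a DataFrame, while passing a single label gives back a Series.

```python
# -*- coding: utf-8 -*-
import io

import pandas as pd

# Hypothetical Shift-JIS csv held in memory instead of on disk.
raw = u"名前,年齢\n太郎,30\n花子,25\n".encode("shift_jis")
df = pd.read_csv(io.BytesIO(raw), encoding="shift_jis")

# A list of labels keeps the result as a one-column DataFrame.
as_frame = df.loc[:, [u"名前"]]

# A single label returns the column as a Series.
as_series = df.loc[:, u"名前"]

print(as_frame)
print(as_series.tolist())
```

The u prefix matters on Python 2.7, where the column labels are unicode objects once the file has been decoded; on Python 3 it is accepted but unnecessary.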

I stumbled on the basics again this time, but I hope this helps anyone who runs into the same problem ...
