I loaded a csv file containing Japanese values using Python's csv module

This post describes a problem that occurs under the following condition, and its solution.
sys.getdefaultencoding(): ascii

First, read the file normally. Each row of the csv is assumed to have the form integer, string.
python
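import csv

class data:
    def __init__(self, id, name):
        self.id = id
        self.name = name

# filename is the path to the csv file; Python 2's csv module expects binary mode
csvfile = open(filename, 'rb')
reader = csv.reader(csvfile)
# Every field comes back as a byte str (UTF-8 encoded in this case)
rows = [data(row[0], row[1]) for row in reader]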
The data for all rows has now been read. Next, **extract the rows whose name contains a given Japanese string**.
python
text = raw_input()
result = [row for row in rows if text in row.name]
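The problem arises here. The string obtained by raw_input() is treated as **unicode type** here (a bare raw_input() in Python 2 returns a byte str in the console encoding, so this assumes the input has been decoded to unicode). However, what you get from data.name is the **UTF-8 str type** read by csv.reader.
The two cannot be compared directly: Python 2 first tries to decode the str with the default encoding, which is ascii, and that fails on Japanese bytes, so an error occurs at `if text in row.name`.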
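The mismatch can be reproduced without a csv file. The following is only a minimal sketch with made-up sample values (`name` and `text` are not from the original data). It is written to run on both Python 2 and Python 3, where the two types are called bytes and str, and the exception raised differs between the two versions.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample values, not taken from the original csv file.
name = u'東京タワー'.encode('utf-8')  # byte string, like a field from Python 2's csv.reader
text = u'東京'                        # unicode search string

error_name = None
try:
    found = text in name
except (UnicodeDecodeError, TypeError) as exc:
    # Python 2: the implicit ascii decoding of `name` fails (UnicodeDecodeError).
    # Python 3: str and bytes cannot be mixed in `in` at all (TypeError).
    found = None
    error_name = type(exc).__name__
```

Either way, the membership test never produces a match while the two operands have different types.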
There are two possible solutions: convert the str read from the csv file to unicode, or convert the unicode search string to a UTF-8 str. Since unicode is easier to handle, I will use the former this time. In this case, the line that needs to be fixed is
python
rows = [data(row[0], row[1]) for row in reader]
Change it as follows:
python
rows = [data(row[0], row[1].decode('utf-8')) for row in reader]
`str.decode` converts a str in the specified character encoding to unicode type. The original here is a UTF-8 str, so I decoded it with utf-8 to get unicode.
Naturally, the encoding passed to decode has to match the character encoding of the original csv file.
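As an illustration of matching the codec to the file, the sketch below assumes a csv that was saved as cp932 (Shift_JIS as used on Japanese Windows) rather than UTF-8. The helper `decode_names` and the sample rows are hypothetical; the rows stand in for what csv.reader would yield in Python 2.

```python
# -*- coding: utf-8 -*-
def decode_names(raw_rows, encoding):
    """Return (id, name) pairs with the name decoded from `encoding` to unicode."""
    return [(row[0], row[1].decode(encoding)) for row in raw_rows]

# Hypothetical rows of byte strings, encoded as cp932 (an assumption for this example).
raw_rows = [
    [b'1', u'東京タワー'.encode('cp932')],
    [b'2', u'大阪城'.encode('cp932')],
]

rows = decode_names(raw_rows, 'cp932')
text = u'東京'
result = [row for row in rows if text in row[1]]

# Decoding the same bytes with the wrong codec fails instead of matching.
try:
    decode_names(raw_rows, 'utf-8')
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True
```

Passing the encoding as a parameter keeps the filtering code unchanged when the source file's encoding differs.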
Now the two unicode values can be compared. Congratulations! When you work with Python on Windows, you often run into this kind of character encoding problem.
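For comparison, the other option (leaving the csv fields as UTF-8 byte strings and encoding the search string instead) might look like the following. This is only a sketch with hypothetical sample values, not the approach used above.

```python
# -*- coding: utf-8 -*-
# Hypothetical name fields, standing in for UTF-8 byte strings from csv.reader.
names = [u'東京タワー'.encode('utf-8'), u'大阪城'.encode('utf-8')]
text = u'東京'

# Encode the unicode search string once, then compare bytes with bytes.
needle = text.encode('utf-8')
result = [name for name in names if needle in name]
```

The matches stay byte strings in this variant, so they still have to be decoded before being displayed or combined with other unicode text.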