Resolving the UnicodeEncodeError raised when writing Japanese text to a file in Python

I think character-encoding handling can fairly be called the "demon gate" of Python, the spot where trouble always comes in. When I first started using Python, I got so fed up with encodings that for a while I thought I would never use the language again. I'm used to it now...

What I was trying to do

  1. Get the query log through the DB API from a Python script.
  2. Write to a file.

That's all.

If the query log contained a Japanese comment, the UnicodeEncodeError from the title could occur when writing.

Below I describe the situation and the solution. The Python version is 2.7; apologies for that this time...

It worked when I tested it...

Fetching through the DB API went fine: I got the query log without any problem and stripped the unnecessary strings. A trial output of the log also worked. So I thought, "Great, now I just have to write it to a file!"

Now let's write that to a file.

# -*- coding: utf-8 -*-

# The real log comes from the API; here a string literal stands in for it to check the behavior.
log = "aaa 日本語"

with open("test.txt", "a") as f:
    f.write(log + "\r\n")

Looking at the file test.txt:

aaa 日本語

It was written correctly. So I figured all that was left was to get the query log from the API and run the script. But when I fetched the log from the API and wrote it, the following error occurred. (Sorry, I'm omitting the API part.)


Traceback (most recent call last):
  File "writetest.py", line 11, in <module>
    f.write(log + "\r\n")
UnicodeEncodeError: 'ascii' codec can't encode characters in position 4-6: ordinal not in range(128)

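In conclusion, the string returned by the API was a unicode object. In Python 2, when a unicode object is passed to write, it is apparently encoded implicitly with the default ascii codec, and that fails on Japanese characters. The earlier test worked because a plain "..." literal in a UTF-8 source file is already a byte string (str), so no encoding step was involved. I haven't investigated this in great detail, but that seems to be the cause.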
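The implicit step can be made visible with a small sketch. This is my own illustration rather than the author's script: it encodes the same stand-in string with the ascii codec explicitly, on the assumption that this mirrors what write did behind the scenes in Python 2.7. The variable names are hypothetical.

```python
# -*- coding: utf-8 -*-
log = u"aaa 日本語"  # stand-in for the text returned by the API

message = None
bad_range = None
try:
    # Assumed equivalent of the implicit conversion done by write() in Python 2.7.
    log.encode("ascii")
except UnicodeEncodeError as exc:
    message = str(exc)
    bad_range = (exc.start, exc.end)  # end index is exclusive

print(message)
print(bad_range)
```

The failing range covers the three Japanese characters after "aaa ", which lines up with the "position 4-6" reported in the traceback.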

How I solved it

log = log.encode("utf_8")

I encoded the string to UTF-8 before writing it.


# -*- coding: utf-8 -*-

# Get the log from the API (the call itself is omitted here)
log = u"aaa 日本語"  # stand-in for the unicode string the API returns

log = log.encode("utf_8")

with open("test.txt", "a") as f:
    f.write(log + "\r\n")

By the way,

f.write(log.encode("utf_8") + "\r\n")

works the same way.
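If some inputs are already byte strings and others are text, calling encode unconditionally can itself go wrong. One way to guard against that is a small helper, sketched below. The helper name to_utf8_bytes and the temporary file path are my own assumptions, not part of the original script, and the file is opened in binary mode so the same code behaves identically on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
import os
import tempfile


def to_utf8_bytes(value):
    """Return value as UTF-8 bytes (hypothetical helper, not from the article)."""
    if isinstance(value, bytes):
        # Already a byte string (Python 2 str / Python 3 bytes): leave it alone.
        return value
    # Text (Python 2 unicode / Python 3 str): encode explicitly.
    return value.encode("utf-8")


log = u"aaa 日本語"  # stand-in for the text returned by the API

# Write into a throwaway directory instead of the working directory.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "ab") as f:  # binary mode: we supply the bytes ourselves
    f.write(to_utf8_bytes(log) + b"\r\n")

with open(path, "rb") as f:
    written = f.read()
```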

Other solutions

Besides the approach above, there seem to be other ways to handle this.
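One alternative I am confident exists is io.open from the standard library, which takes an encoding argument and encodes text on write. Whether it was among the methods the author had in mind is an assumption on my part. A minimal sketch, using a temporary path of my own choosing:

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

log = u"aaa 日本語"  # stand-in for the text returned by the API

path = os.path.join(tempfile.mkdtemp(), "test.txt")

# io.open encodes text for us; newline="" keeps "\r\n" from being translated.
with io.open(path, "a", encoding="utf-8", newline="") as f:
    f.write(log + u"\r\n")  # pass text (unicode), not bytes

# Read the raw bytes back to confirm what ended up on disk.
with io.open(path, "rb") as f:
    raw = f.read()
```

With this approach the script keeps working with text everywhere and the conversion to bytes happens only at the file boundary.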

Unicode type behavior

While we're at it, let's check the basic behavior of the unicode type.

# coding: utf-8

str_1 = "日本語"
str_2 = u"日本語"

print str_1
print str_2

print type(str_1)
print type(str_2)

print len(str_1)
print len(str_2)

print ("本" in str_1)
print (u"本" in str_2)

print str_1.find("本")
print str_2.find(u"本")

Running this gives:

日本語             # print str_1
日本語             # print str_2
<type 'str'>       # print type(str_1)
<type 'unicode'>   # print type(str_2)
9                  # print len(str_1)
3                  # print len(str_2)
True               # print ("本" in str_1)
True               # print (u"本" in str_2)
3                  # print str_1.find("本")
1                  # print str_2.find(u"本")

At a glance, you can see that the str type handles the string as bytes (each Japanese character here takes three bytes in UTF-8), while the unicode type handles it as characters, which matches what a person would intuitively expect.

From this, I think unicode is the easier type to work with when dealing with text.
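The relationship between the two views can be sketched in a form that runs on both Python 2.7 and Python 3. This is my own illustration; the variable names are hypothetical. It converts between text and UTF-8 bytes explicitly and compares lengths and positions.

```python
# -*- coding: utf-8 -*-
text = u"日本語"               # text: one element per character
data = text.encode("utf-8")   # bytes: one element per byte

char_len = len(text)                          # counts characters
byte_len = len(data)                          # counts bytes
char_pos = text.find(u"本")                   # index in characters
byte_pos = data.find(u"本".encode("utf-8"))   # index in bytes

# Decoding the bytes with the same codec gives back the original text.
round_trip = data.decode("utf-8") == text

print(char_len, byte_len, char_pos, byte_pos, round_trip)
```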

That is presumably why the log fetched through the API at the start came back as the unicode type.

Incidentally, I had no idea about this behavior before.
