I tried a number of combinations. The conclusion first: basically, you don't have to think too hard, because garbled characters are easy to avoid. That said, I don't really understand how Python 3 behaves when you change the character set registration with add_charset.
On to the main subject. This is the base code.
# -*- coding: utf-8 -*-
import smtplib
from email.mime.text import MIMEText
from email.header import Header
from email import charset
con = smtplib.SMTP('localhost')
cset = 'utf-8' # <---------------(It's a character set setting)
message = MIMEText(u"It's a Japanese email ★", 'plain', cset)
message['Subject'] = Header(u'Email sending test', cset)
message['From'] = '[email protected]'
message['To'] = '[email protected]'
con.sendmail('[email protected]', ['[email protected]'], message.as_string())
Let's try it.
Python2.7.2 + None
Starting with a curveball: first I'll try the case where no character set is set.
In the base code, I set cset = None.
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-9: ordinal not in range(128)
It raises an error. If you do not specify a character set, the text is processed as us-ascii, so the encoding fails somewhere along the way.
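The failure can be reproduced without sending anything. The following is a minimal Python 3 sketch, not the library's own code: `can_encode_ascii` is a hypothetical helper, and the Japanese sample string is an assumed stand-in for the mail body.

```python
def can_encode_ascii(text):
    """Return True if text survives encoding with the us-ascii codec."""
    try:
        text.encode('us-ascii')
    except UnicodeEncodeError:
        # Same exception type as in the traceback above.
        return False
    return True


# Assumed sample body; any non-ASCII text behaves the same way.
japanese_body = '日本語のメールだよ★'

print(can_encode_ascii('plain ASCII text'))
print(can_encode_ascii(japanese_body))
```

Any text that fails this check needs an explicit character set such as utf-8.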
Python2.7.2 + utf-8 (with BASE64)
Set `cset = 'utf-8'` in the base code. The mail was received without problems. The raw data looks like this.
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Subject: =?utf-8?b?44Oh44O844Or6YCB5L+h44OG44K544OI?=
From: [email protected]
To: [email protected]
Reply-To: [email protected]
The body encoding is Base64. This is because Python's default is as follows.
'utf-8': (SHORTEST, BASE64, 'utf-8'),
# The tuple is (header encoding, body encoding, output charset). It is written in charset.py
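The registered values can also be checked from code rather than by reading charset.py. This is a small Python 3 sketch using `email.charset.Charset`. `describe_charset` is a hypothetical helper, and the results assume nothing has called add_charset beforehand.

```python
from email import charset


def describe_charset(name):
    """Return (header encoding, body encoding, output charset) for name."""
    cs = charset.Charset(name)
    return (cs.header_encoding, cs.body_encoding, cs.output_charset)


# With the library defaults, utf-8 bodies are Base64-encoded.
utf8_entry = describe_charset('utf-8')
print(utf8_entry == (charset.SHORTEST, charset.BASE64, 'utf-8'))
```

The encodings are module-level constants (`charset.QP`, `charset.BASE64`, `charset.SHORTEST`), so comparing against them is clearer than comparing against raw numbers.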
There is probably almost no problem with this, although in the past it did not work with au handsets. I think it's fine now. That's all for this case.
Python2.7.2 + utf-8 with QP
I hate Base64! In that case, override the charset registration. Insert this near the beginning of the base code.
sendmain.py: write this somewhere
charset.add_charset('utf-8', charset.SHORTEST, charset.QP, 'utf-8')
# For utf-8: the header encoding is SHORTEST, the body encoding is QP (quoted-printable), and the output charset is utf-8
cset = 'utf-8'
When you do this, it looks like this:
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Subject: =?utf-8?b?44Oh44O844Or6YCB5L+h44OG44K544OI?=
From: [email protected]
To: [email protected]
Reply-To: [email protected]
The body is now quoted-printable instead of Base64. It was received without problems.
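To see what the two body encodings do to the same bytes, here is a short Python 3 sketch using the standard `base64` and `quopri` modules directly. It is only an illustration: the email package uses its own internal encoders, and the sample text is an assumption.

```python
import base64
import quopri

# Assumed sample text, encoded to UTF-8 bytes first.
body_bytes = 'メール'.encode('utf-8')

as_base64 = base64.b64encode(body_bytes)
as_qp = quopri.encodestring(body_bytes)

print(as_base64)
print(as_qp)
```

Quoted-printable writes every non-ASCII byte as `=XX`, so Japanese text grows to about three times its byte length. Base64 adds roughly a third.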
Python2.7.2 + utf-8 with 8bit
What if I don't specify anything for the body encoding?
sendmain.py: write this somewhere
charset.add_charset('utf-8', charset.SHORTEST, None, 'utf-8')
cset = 'utf-8'
The output looks like this. The body comes out as-is.
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: =?utf-8?b?44Oh44O844Or6YCB5L+h44OG44K544OI?=
From: [email protected]
To: [email protected]
Reply-To: [email protected]
It's a Japanese email ★
Content-Transfer-Encoding becomes either 7bit or 8bit; the function encode_7or8bit() in email/encoders.py picks whichever fits. If you want 8bit, this is the way to do it. This is probably quite common these days.
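The 7bit-or-8bit decision can be sketched as an ASCII probe on the encoded body. This is a simplified Python 3 illustration of the idea, not a copy of the library function. `guess_transfer_encoding` is a hypothetical name and the sample text is an assumption.

```python
def guess_transfer_encoding(body_bytes):
    """Return '7bit' if every byte is ASCII, otherwise '8bit'."""
    try:
        body_bytes.decode('ascii')
    except UnicodeError:
        return '8bit'
    return '7bit'


sample = 'メール'  # assumed sample text

# UTF-8 uses bytes above 0x7F, so the body needs 8bit.
print(guess_transfer_encoding(sample.encode('utf-8')))
# ISO-2022-JP uses escape sequences and ASCII-range bytes only.
print(guess_transfer_encoding(sample.encode('iso-2022-jp')))
```

This explains why a utf-8 body with no body encoding is labelled 8bit, while an iso-2022-jp body fits in 7bit.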
Python2.7.2 + shift_jis
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit
Subject: =?iso-2022-jp?b?GyRCJWEhPCVrQXc/LiVGJTklSBsoQg==?=
From: [email protected]
To: [email protected]
Reply-To: [email protected]
When the character set is 'shift_jis', the output is iso-2022-jp, which everyone loves. This is Python's default setting.
'shift_jis': (BASE64, None, 'iso-2022-jp'),
The body encoding is None, and Content-Transfer-Encoding becomes 7bit automatically.
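The Subject lines in the raw output can be decoded back to check that the utf-8 and iso-2022-jp versions carry the same text. This is a Python 3 sketch using `email.header.decode_header`. `decode_subject` is a hypothetical helper, and the two header values are copied from the outputs shown in this post.

```python
from email.header import decode_header


def decode_subject(raw_header):
    """Decode an RFC 2047 encoded header value into a str."""
    parts = []
    for data, charset_name in decode_header(raw_header):
        if isinstance(data, bytes):
            parts.append(data.decode(charset_name or 'ascii'))
        else:
            parts.append(data)
    return ''.join(parts)


utf8_subject = '=?utf-8?b?44Oh44O844Or6YCB5L+h44OG44K544OI?='
jis_subject = '=?iso-2022-jp?b?GyRCJWEhPCVrQXc/LiVGJTklSBsoQg==?='

print(decode_subject(utf8_subject))
print(decode_subject(utf8_subject) == decode_subject(jis_subject))
```

Both headers decode to the same Japanese subject; only the wire representation differs.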
Python3.3.0 + None
Next, try with Python 3. First, the case where no character set is specified. This is the one that raised UnicodeEncodeError in Python 2.
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Subject: =?utf-8?b?44Oh44O844Or6YCB5L+h44OG44K544OI?=
From: [email protected]
To: [email protected]
Reply-To: [email protected]
Surprisingly, it gets sent! It was received without problems. It looks as if the library inspects the content, tries us-ascii first, and falls back to utf-8 if that raises UnicodeEncodeError. So with Python 3.3 you can send email without being aware of the character set at all.
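That fallback can be observed by building two messages without a character set and reading back what was chosen. This is a Python 3 sketch. `detected_charset` is a hypothetical helper, the sample strings are assumptions, and the result assumes add_charset has not been called.

```python
from email.mime.text import MIMEText


def detected_charset(text):
    """Build a MIMEText with no explicit charset and report the one chosen."""
    message = MIMEText(text, 'plain')
    return message.get_content_charset()


ascii_choice = detected_charset('hello')
japanese_message = MIMEText('日本語のメール', 'plain')

print(ascii_choice)
print(japanese_message.get_content_charset())
print(japanese_message['Content-Transfer-Encoding'])
```

ASCII-only text stays us-ascii, while anything else is promoted to utf-8 with the default Base64 body encoding.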
Python3.3.0 + utf-8 (with BASE64)
So, even if you set cset = 'utf-8', it should be the same as above.
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Subject: =?utf-8?b?44Oh44O844Or6YCB5L+h44OG44K544OI?=
From: [email protected]
To: [email protected]
Reply-To: [email protected]
It's the same! Next!
Python3.3.0 + utf-8 with QP
I want to use QP for the body!
So, as with Python 2, add the following somewhere.
sendmain.py: write this somewhere
charset.add_charset('utf-8', charset.SHORTEST, charset.QP, 'utf-8')
cset = 'utf-8'
Send the email!
self.set_payload(_text, _charset)
File "/Users/yasunori/.pythonbrew/pythons/Python-3.3.0/Frameworks/Python.framework/Versions/3.3/lib/python3.3/email/message.py", line 280, in set_payload
File "/Users/yasunori/.pythonbrew/pythons/Python-3.3.0/Frameworks/Python.framework/Versions/3.3/lib/python3.3/email/message.py", line 317, in set_charset
self._payload = charset.body_encode(self._payload)
File "/Users/yasunori/.pythonbrew/pythons/Python-3.3.0/Frameworks/Python.framework/Versions/3.3/lib/python3.3/email/charset.py", line 395, in body_encode
return email.quoprimime.body_encode(string)
File "/Users/yasunori/.pythonbrew/pythons/Python-3.3.0/Frameworks/Python.framework/Versions/3.3/lib/python3.3/email/quoprimime.py", line 240, in body_encode
if body_check(ord(c)):
File "/Users/yasunori/.pythonbrew/pythons/Python-3.3.0/Frameworks/Python.framework/Versions/3.3/lib/python3.3/email/quoprimime.py", line 81, in body_check
return chr(octet) != _QUOPRI_BODY_MAP[octet]
KeyError: 26085
It raised an error! Scary! It says the dictionary has no such key. As far as I can see, _QUOPRI_BODY_MAP should only hold entries for single-byte (roughly alphanumeric) characters, yet the code is trying to look up character number 26085. I wonder why... I couldn't tell at a glance, so I'm putting this on hold.
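One clue is what the number 26085 actually is. The Python 3 sketch below only inspects that value. It suggests, without proving, that the quoted-printable encoder was handed a Unicode character instead of a byte.

```python
failing_key = 26085  # the key from the KeyError above

# Interpreted as a code point, it is a kanji (U+65E5).
character = chr(failing_key)
print(character, hex(failing_key))

# The same character as UTF-8 is three bytes, each within 0-255.
utf8_bytes = character.encode('utf-8')
print([hex(b) for b in utf8_bytes])
```

If the body had been encoded to bytes before the lookup, every key would fall within 0-255. So the error is consistent with the text being processed as a str at that point.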
Python3.3.0 + utf-8 with 8bit
I want to send it as-is in 8bit.
sendmain.py: write this somewhere
charset.add_charset('utf-8', charset.SHORTEST, None, 'utf-8')
Add this and send.
File "/Users/yasunori/.pythonbrew/pythons/Python-3.3.0/Frameworks/Python.framework/Versions/3.3/lib/python3.3/smtplib.py", line 744, in sendmail
msg = _fix_eols(msg).encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode characters in position 231-240: ordinal not in range(128)
It raised an error! Scary!
Why does this one fail?
Actually, unlike the QP error, `message.as_string()` succeeds, and the mail text is built properly.
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: =?cp932?b?g4GBW4OLkZeQTYNlg1iDZw==?=
From: [email protected]
To: [email protected]
Reply-To: [email protected]
It's a Japanese email ★
The message itself looks fine. It is at send time that the error occurs. Looking at the traceback above, it fails because smtplib hard-codes an encode to ascii... What should I do about this? Please tell me...
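One possible way around this, offered only as a sketch and not verified against Python 3.3.0: smtplib's sendmail is documented to accept a byte string as well as a str, and the ascii encode in the traceback is applied to the message text. Encoding the text yourself may therefore sidestep that step. `to_wire_bytes` is a hypothetical helper and the raw message here is an assumed sample.

```python
def to_wire_bytes(raw_message, encoding='utf-8'):
    """Hypothetical helper: normalize line endings to CRLF and encode."""
    unified = raw_message.replace('\r\n', '\n').replace('\r', '\n')
    return unified.replace('\n', '\r\n').encode(encoding)


# Assumed sample; in the base code this would be message.as_string().
raw_message = (
    'Content-Type: text/plain; charset="utf-8"\n'
    'Content-Transfer-Encoding: 8bit\n'
    '\n'
    '日本語のメール\n'
)
payload = to_wire_bytes(raw_message)

# Assumed usage with the connection and addresses from the base code:
# con.sendmail(sender, recipients, payload)
print(type(payload).__name__, len(payload))
```

Whether the receiving server accepts 8-bit data is a separate question, so Base64 remains the safer default.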
Python3.3.0 + shift_jis
Good old sjis, stable as ever.
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit
Subject: =?iso-2022-jp?b?GyRCJWEhPCVrQXc/LiVGJTklSBsoQg==?=
From: [email protected]
To: [email protected]
Reply-To: [email protected]
It mostly works as expected, but on Python 3 the behavior after calling add_charset is quite suspicious, and for now it's an area best avoided. Or am I doing it wrong??