How to set a charset in email using smtplib in Python 2.7?

f1nn · Apr 24, 2012

I'm writing a simple SMTP sender with authentication. Here's my code:

    SMTPserver, sender, destination = 'smtp.googlemail.com', '[email protected]', ['[email protected]']
    USERNAME, PASSWORD = "user", "password"

    # typical values for text_subtype are plain, html, xml
    text_subtype = 'plain'


    content="""
    Hello, world!
    """

    subject="Message Subject"

    import sys                                 # needed for sys.exit() in the error handler below
    from smtplib import SMTP_SSL as SMTP       # this invokes the secure SMTP protocol (port 465, uses SSL)
    # from smtplib import SMTP                  # use this for standard SMTP protocol   (port 25, no encryption)
    from email.MIMEText import MIMEText

    try:
        msg = MIMEText(content, text_subtype)
        msg['Subject'] = subject
        msg['From'] = sender  # some SMTP servers will do this automatically, not all

        conn = SMTP(SMTPserver)
        conn.set_debuglevel(False)
        conn.login(USERNAME, PASSWORD)
        try:
            conn.sendmail(sender, destination, msg.as_string())
        finally:
            conn.close()

    except Exception as exc:
        sys.exit("mail failed; %s" % str(exc))  # give an error message

It works perfectly until I try to send non-ASCII symbols (Russian Cyrillic). How should I define a charset in the message so that it displays properly? Thanks in advance!

UPD. I've changed my code:

    text_subtype = 'text'
    content="<p>Текст письма</p>"
    msg = MIMEText(content, text_subtype)
    msg['From']=sender # some SMTP servers will do this automatically, not all
    msg['MIME-Version']="1.0"
    msg['Subject']="=?UTF-8?Q?Тема письма?="
    msg['Content-Type'] = "text/html; charset=utf-8"
    msg['Content-Transfer-Encoding'] = "quoted-printable"
    …
    conn.sendmail(sender, destination, str(msg))

So first I specify text_subtype = 'text', and then I set the header with msg['Content-Type'] = "text/html; charset=utf-8". Is this correct?
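
As a rough check of the updated snippet (a sketch, not part of the original post): on an `email` message object, `msg[name] = value` appends a header rather than replacing an existing one. The code above would therefore likely end up with two Content-Type headers: `text/text` from the constructor and the `text/html` one set by hand. The ASCII-only body below is an assumption, made so the sketch runs unchanged on Python 2.7 and 3.

```python
from email.mime.text import MIMEText

# Subtype 'text' makes the constructor emit "Content-Type: text/text".
msg = MIMEText("<p>Hello</p>", "text")

# Item assignment appends a second header; it does not overwrite the first.
msg["Content-Type"] = "text/html; charset=utf-8"
content_types = msg.get_all("Content-Type")
print(content_types)  # two entries: text/text... and text/html...

# Deleting by name removes every occurrence, so a single value can be set.
del msg["Content-Type"]
msg["Content-Type"] = "text/html; charset=utf-8"
fixed_types = msg.get_all("Content-Type")
print(fixed_types)
```

Passing the subtype and charset to the constructor, for example `MIMEText(body, 'html', 'utf-8')`, avoids the manual header juggling altogether. Note also that an RFC 2047 encoded word such as `=?UTF-8?Q?...?=` may contain only ASCII characters, so raw Cyrillic text inside it is not a valid Subject value.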

UPDATE: Finally, I've solved my message problem. You should write something like msg = MIMEText(content.encode('utf-8'), 'plain', 'UTF-8').
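
A minimal sketch of that fix, with a quick look at what the message ends up declaring. The sample strings and the version check are assumptions added for illustration: Python 2.7 needs the UTF-8 bytes, as in the line above, while Python 3 accepts the text directly. The Subject is wrapped in `email.header.Header` so that it is encoded as well.

```python
# -*- coding: utf-8 -*-
import sys
from email.header import Header
from email.mime.text import MIMEText

content = u"Привет, мир!"   # sample body (assumption)
subject = u"Тема письма"    # sample subject (assumption)

# Python 2.7: pass UTF-8 bytes; Python 3: pass the text itself.
payload = content.encode("utf-8") if sys.version_info[0] == 2 else content
msg = MIMEText(payload, "plain", "utf-8")
msg["Subject"] = Header(subject, "utf-8")

print(msg["Content-Type"])                # text/plain; charset="utf-8"
print(msg["Content-Transfer-Encoding"])   # base64

# Decoding the transfer encoding gives back the original text.
decoded_body = msg.get_payload(decode=True).decode("utf-8")
wire_form = msg.as_string()               # what would be handed to sendmail()
```

`wire_form` contains only ASCII: the body is base64-encoded and the Subject is written as an encoded word, which is why the receiving client can display the Cyrillic text correctly.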

Answer

Lorcan O'Neill · Jan 24, 2013
    from email.header import Header
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText

    def contains_non_ascii_characters(text):
        return not all(ord(c) < 128 for c in text)

    def add_header(message, header_name, header_value):
        # Wrap the value in a Header only when it needs RFC 2047 encoding
        if contains_non_ascii_characters(header_value):
            h = Header(header_value, 'utf-8')
            message[header_name] = h
        else:
            message[header_name] = header_value
        return message

    ............
    msg = MIMEMultipart('alternative')
    msg = add_header(msg, 'Subject', subject)

    if contains_non_ascii_characters(html):
        html_text = MIMEText(html.encode('utf-8'), 'html', 'utf-8')
    else:
        html_text = MIMEText(html, 'html')

    if contains_non_ascii_characters(plain):
        plain_text = MIMEText(plain.encode('utf-8'), 'plain', 'utf-8')
    else:
        plain_text = MIMEText(plain, 'plain')

    msg.attach(plain_text)
    msg.attach(html_text)

This should give you the proper encoding for both the body and the headers, whether or not your text contains non-ASCII characters. It also means that ASCII-only parts are not base64-encoded unnecessarily.
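
The effect on the transfer encoding can be checked with a small sketch. `make_part` is a hypothetical helper that condenses the branching above, and the sample strings and the Python 2/3 version check are assumptions added so the snippet runs on either version.

```python
# -*- coding: utf-8 -*-
import sys
from email.header import Header
from email.mime.text import MIMEText

PY2 = sys.version_info[0] == 2

def make_part(text, subtype="plain"):
    """Hypothetical helper: declare UTF-8 only when the text needs it."""
    if all(ord(c) < 128 for c in text):
        return MIMEText(text, subtype)
    # Python 2.7 expects UTF-8 bytes; Python 3 takes the text directly.
    payload = text.encode("utf-8") if PY2 else text
    return MIMEText(payload, subtype, "utf-8")

ascii_part = make_part(u"Hello, world!")
cyrillic_part = make_part(u"Привет, мир!")

print(ascii_part["Content-Transfer-Encoding"])      # 7bit
print(cyrillic_part["Content-Transfer-Encoding"])   # base64

# A non-ASCII header value becomes an RFC 2047 encoded word.
encoded_subject = Header(u"Тема письма", "utf-8").encode()
print(encoded_subject)
```

The ASCII-only part stays human-readable in the raw message, while only the part that actually contains Cyrillic text pays the base64 overhead.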