Email body is a string sometimes and a list sometimes. Why?

None-da · Feb 27, 2009

My application is written in Python. I run a script on each email received by Postfix and do something with the email content. Procmail is responsible for running the script, taking the email as input. The problem started when I was converting the input message (possibly plain text) into an email message object (because the latter comes in handy). I am using email.message_from_string (where email is the standard email module that ships with Python).

import email
message = email.message_from_string(original_mail_content)
message_body = message.get_payload()

This message_body is sometimes a list of email.message.Message instances ([email.message.Message instance, email.message.Message instance]) and sometimes a string (the actual body content of the incoming email). Why is that? I also noticed something else: while browsing the email.message.Message.get_payload() docstring, I found this:
""" The payload will either be a list object or a string. If you mutate the list object, you modify the message's payload in place....."""

So how can I write a generic method to get the body of an email in Python? Please help me out.

Answer

Ali Afshar · Feb 27, 2009

Well, the answers are correct, you should read the docs, but here is an example of a generic way:

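def get_first_text_part(msg):
    """Return the payload of the first text/* part, or None if there is none."""
    maintype = msg.get_content_maintype()
    if maintype == 'multipart':
        # Multipart message: get_payload() is a list of sub-messages.
        for part in msg.get_payload():
            if part.get_content_maintype() == 'text':
                return part.get_payload()
    elif maintype == 'text':
        # Single-part message: get_payload() is the body string itself.
        return msg.get_payload()
    return None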

This is prone to some disaster: the parts themselves may be multipart, which this function does not descend into, and it only returns the first text part, which may not be the one you want. But you can play with it.
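
If nested multiparts are a concern, one possible variation is to let Message.walk() do the descending. The payload is a list exactly when the message is multipart (is_multipart() is true) and a string otherwise, so the sketch below skips the containers and collects every text leaf. It assumes Python 3. The name get_text_parts, the sample messages PLAIN and NESTED, and the UTF-8 fallback for parts that declare no charset are my own choices for illustration, not part of the email module:

```python
from email import message_from_string


def get_text_parts(msg):
    """Return the decoded text of every text/* leaf part, in document order."""
    texts = []
    # walk() yields the message itself, then every nested part, depth first.
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold other parts, not body text
        if part.get_content_maintype() != 'text':
            continue  # skip images, attachments and so on
        # decode=True undoes base64 / quoted-printable and returns bytes.
        raw = part.get_payload(decode=True)
        if raw is None:
            continue
        # Assumption: fall back to UTF-8 when no charset is declared.
        charset = part.get_content_charset() or 'utf-8'
        try:
            texts.append(raw.decode(charset, errors='replace'))
        except LookupError:  # charset name Python does not recognise
            texts.append(raw.decode('utf-8', errors='replace'))
    return texts


# Hypothetical sample messages, only to show both payload shapes.
PLAIN = "Subject: hi\n\nJust a string body.\n"

NESTED = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="outer"\n'
    "\n"
    "--outer\n"
    'Content-Type: multipart/alternative; boundary="inner"\n'
    "\n"
    "--inner\n"
    "Content-Type: text/plain\n"
    "\n"
    "plain version\n"
    "--inner\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>html version</p>\n"
    "--inner--\n"
    "--outer--\n"
)

if __name__ == "__main__":
    for source in (PLAIN, NESTED):
        message = message_from_string(source)
        # Shows whether the payload is a str or a list for this message.
        print(message.is_multipart(), type(message.get_payload()).__name__)
        print(get_text_parts(message))
```

This returns both the plain and the HTML alternative when a message carries both, so you may still want to prefer part.get_content_type() == 'text/plain' if you only need one body.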