Get the Gmail attachment filename without downloading it

mopsiok picture mopsiok · Dec 1, 2012 · Viewed 9.5k times · Source

I'm trying to get all the messages from a Gmail account that may contain some large attachments (about 30MB). I just need the names, not the whole files. I found a piece of code to get a message and the attachment's name, but it downloads the file and then read its name:

import imaplib, email

#log in and select the inbox
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('username', 'password')
mail.select('inbox')

#get uids of all messages
result, data = mail.uid('search', None, 'ALL') 
uids = data[0].split()

#read the lastest message
result, data = mail.uid('fetch', uids[-1], '(RFC822)')
m = email.message_from_string(data[0][1])

if m.get_content_maintype() == 'multipart': #multipart messages only
    for part in m.walk():
        #find the attachment part
        if part.get_content_maintype() == 'multipart': continue
        if part.get('Content-Disposition') is None: continue

        #save the attachment in the program directory
        filename = part.get_filename()
        fp = open(filename, 'wb')
        fp.write(part.get_payload(decode=True))
        fp.close()
        print '%s saved!' % filename

I have to do this once a minute, so I can't download hundreds of MB of data. I am a newbie into the web scripting, so could anyone help me? I don't actually need to use imaplib, any python lib will be ok for me.

Best regards

Answer

rhettg picture rhettg · Dec 1, 2012

Rather than fetch RFC822, which is the full content, you could specify BODYSTRUCTURE.

The resulting data structure from imaplib is pretty confusing, but you should be able to find the filename, content-type and sizes of each part of the message without downloading the entire thing.