I've read through this question on SO, but it doesn't quite help me. I want to import a Gmail-generated mbox file into another webmail service, but that service only accepts files of up to 40 MB per import.
So I somehow have to split the mbox file into files of at most 40 MB each and import them one after another. How would you do this?
My initial thought was to use the other script (formail) to save each mail as a single file and afterwards run a script to combine them into files of up to 40 MB, but I still wouldn't know how to do this from the terminal.
I also looked at the split command, but I'm afraid it would cut mails off in the middle.
Thanks for any help!
If your mbox is in standard format, each message will begin with From and a space:
From sender@example.com
So, you could COPY YOUR MBOX TO A TEMPORARY DIRECTORY and try using awk to process it on a message-by-message basis, splitting only at the start of a message. Let's say we went for 1,000 messages per output file:
awk 'BEGIN{chunk=0} /^From /{msgs++;if(msgs==1000){msgs=0;close("chunk_" chunk ".txt");chunk++}}{print > ("chunk_" chunk ".txt")}' mbox
You will then get output files called chunk_0.txt to chunk_n.txt, each containing up to 1,000 messages.
If you are unfortunate enough to be on Windows (where cmd.exe does not treat single quotes as quoting characters), you will need to save the following in a file called awk.txt:
BEGIN{chunk=0} /^From /{msgs++;if(msgs==1000){msgs=0;close("chunk_" chunk ".txt");chunk++}}{print > ("chunk_" chunk ".txt")}
and then type
awk -f awk.txt mbox
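Splitting by message count does not by itself guarantee that every piece stays under the 40 MB limit from the question, since messages with attachments vary a lot in size. Below is a rough sketch of a size-based variant, not a tested tool. The function name split_mbox_by_size, the .mbox suffix and the chunk_ default prefix are made up for illustration, and a POSIX shell and awk are assumed. It buffers one message at a time in memory and starts a new output file whenever the next message would push the current one past the limit. LC_ALL=C is set so that length() counts bytes rather than characters.

```shell
# Hypothetical helper (not from the answer above): split an mbox into pieces
# of at most max_bytes each, breaking only at lines that start with "From ".
# A single message larger than max_bytes still ends up in a file of its own,
# and that file will exceed the limit.
split_mbox_by_size() {
    # $1 = input mbox, $2 = max bytes per output file, $3 = output prefix
    LC_ALL=C awk -v max="$2" -v prefix="${3:-chunk_}" '
        BEGIN { chunk = 0 }
        # Write the buffered message to the current chunk, first moving on
        # to a new chunk if the message would push this one past the limit.
        function flush_msg() {
            if (len == 0) return
            if (size > 0 && size + len > max) {
                close(out); chunk++; size = 0
            }
            out = prefix chunk ".mbox"
            printf "%s", buf > out
            size += len
            buf = ""; len = 0
        }
        /^From / { flush_msg() }                      # a new message begins
        { buf = buf $0 "\n"; len += length($0) + 1 }  # buffer the current line
        END { flush_msg() }                           # write the last message
    ' "$1"
}
```

For example, split_mbox_by_size mbox 39000000 chunk_ would write chunk_0.mbox, chunk_1.mbox and so on, leaving a little headroom under 40 MB (whether the service counts 40 MB as 40,000,000 or 41,943,040 bytes is not stated in the question). Running grep -c '^From ' on the original and on each piece is a quick way to check that the message counts add up.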