How to send 100,000 emails weekly?

xRobot picture xRobot · Oct 11, 2010 · Viewed 160.2k times · Source

How can one send an email to 100,000 users on a weekly basis in PHP? This includes mail to subscribers using the following providers:

  • AOL
  • G-Mail
  • Hotmail
  • Yahoo

It is important that all e-mail actually be delivered, to the extent that it is possible. Obviously, just sending the mail conventionally would do nothing but create problems.

Is there a library for PHP that makes this simpler?

Answer

Piskvor left the building picture Piskvor left the building · Oct 11, 2010

Short answer: While it's technically possible to send 100k e-mails each week yourself, the simplest, easiest and cheapest solution is to outsource this to one of the companies that specialize in it (I did say "cheapest": there's no limit to the amount of development time (and therefore money) that you can sink into this when trying to DIY).

Long answer: If you decide that you absolutely want to do this yourself, prepare for a world of hurt (after all, this is e-mail/e-fail we're talking about). You'll need:

  • e-mail content that is not spam (otherwise you'll run into additional major roadblocks on every step, even legal repercussions)
  • in addition, your content should be easy to distinguish from spam - that may be a bit hard to do in some cases (I heard that a certain pharmaceutical company had to all but abandon e-mail, as their brand names are quite common in spams)
  • a configurable SMTP server of your own, one which won't buckle when you dump 100k e-mails onto it (your ISP's upstream server won't be sufficient here and you'll make the ISP violently unhappy; we used two dedicated boxes)
  • some mail wrapper (e.g. PhpMailer if PHP's your poison of choice; using PHP's mail() is horrible enough by itself)
  • your own sender function to run in a loop, create the mails and pass them to the wrapper (note that you may run into PHP's memory limits if your app has a memory leak; you may need to recycle the sending process periodically, or even better, decouple the "creating e-mails" and "sending e-mails" altogether)

Surprisingly, that was the easy part. The hard part is actually sending it:

  • some servers will ban you when you send too many mails close together, so you need to shuffle and watch your queue (e.g. send one mail to [email protected], then three to other domains, only then another to [email protected])
  • you need to have correct PTR, SPF, DKIM records
  • handling remote server timeouts, misconfigured DNS records and other network pleasantries
  • handling invalid e-mails (and no, regex is the wrong tool for that)
  • handling unsubscriptions (many legitimate newsletters have been reclassified as spam due to many frustrated users who couldn't unsubscribe in one step and instead chose to "mark as spam" - the spam filters do learn, esp. with large e-mail providers)
  • handling bounces and rejects ("no such mailbox [email protected]","mailbox [email protected] full")
  • handling blacklisting and removal from blacklists (Sure, you're not sending spam. Some recipients won't be so sure - with such large list, it will happen sometimes, no matter what precautions you take. Some people (e.g. your not-so-scrupulous competitors) might even go as far to falsely report your mailings as spam - it does happen. On average, it takes weeks to get yourself removed from a blacklist.)

And to top it off, you'll have to manage the legal part of it (various federal, state, and local laws; and even different tangles of laws once you send outside the U.S. (note: you have no way of finding if [email protected] lives in Southwest Elbonia, the country with world's most draconian antispam laws)).

I'm pretty sure I missed a few heads of this hydra - are you still sure you want to do this yourself? If so, there'll be another wave, this time merely the annoying problems inherent in sending an e-mail. (You see, SMTP is a store-and-forward protocol, which means that your e-mail will be shuffled across many SMTP servers around the Internet, in the hope that the next one is a bit closer to the final recipient. Basically, the e-mail is sent to an SMTP server, which puts it into its forward queue; when time comes, it will forward it further to a different SMTP server, until it reaches the SMTP server for the given domain. This forward could happen immediately, or in a few minutes, or hours, or days, or never.) Thus, you'll see the following issues - most of which could happen en route as well as at the destination:

  • the remote SMTP servers don't want to talk to your SMTP server
  • your mails are getting marked as spam (<blink> is not your friend here, nor is <font color=...>)
  • your mails are delivered days, even weeks late (contrary to popular opinion, SMTP is designed to make a best effort to deliver the message sometime in the future - not to deliver it now)
  • your mails are not delivered at all (already sent from e-mail server on hop #4, not sent yet from server on hop #5, the server that currently holds the message crashes, data is lost)
  • your mails are mangled by some braindead server en route (this one is somewhat solvable with base64 encoding, but then the size goes up and the e-mail looks more suspicious)
  • your mails are delivered and the recipients seem not to want them ("I'm sure I didn't sign up for this, I remember exactly what I did a year ago" (of course you do, sir))
  • users with various versions of Microsoft Outlook and its special handling of Internet mail
  • wizard's apprentice mode (a self-reinforcing positive feedback loop - in other words, automated e-mails as replies to automated e-mails as replies to...; you really don't want to be the one to set this off, as you'd anger half the internet at yourself)

and it'll be your job to troubleshoot and solve this (hint: you can't, mostly). The people who run a legit mass-mailing businesses know that in the end you can't solve it, and that they can't solve it either - and they have the reasons well researched, documented and outlined (maybe even as a Powerpoint presentation - complete with sounds and cool transitions - that your bosses can understand), as they've had to explain this a million times before. Plus, for the problems that are actually solvable, they know very well how to solve them.

If, after all this, you are not discouraged and still want to do this, go right ahead: it's even possible that you'll find a better way to do this. Just know that the road ahead won't be easy - sending e-mail is trivial, getting it delivered is hard.