How to get the body of an email piped to a program

vizenor · Apr 20, 2011 · Viewed 8.9k times

I am piping an email to a program and running some code.

**I know how to get the "From:" and the "Subject:" but how do I get only the body of the email?**

#!/usr/bin/php -q
<?php

// read the whole piped message from stdin
$email = '';
$fd = fopen("php://stdin", "r");
while (!feof($fd)) {
    $email .= fread($fd, 1024);
}
fclose($fd);

$lines = explode("\n", $email);

for ($i = 0; $i < count($lines); $i++) {

    // look out for special headers
    if (preg_match("/Subject:/", $lines[$i], $matches)) {

        // keep everything after "Subject:" ...
        list($One, $Subject) = explode("Subject:", $lines[$i], 2);
        // ... and drop anything from the first "<" onwards
        $Subject = explode("<", $Subject, 2)[0];

    }

    // etc...
}

How do I get the body content of the email?

Answer

Jared Farrish · Apr 20, 2011

Basically, you need to find where the headers end, and to know whether the message is multipart, so that you can pull out the right part(s) of the email.

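Here is some information:

parsing raw email in php

That post says the first double newline (an empty line) marks the beginning of the body of the email. Note that raw mail often uses CRLF line endings, in which case the separator appears as "\r\n\r\n" rather than "\n\n".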
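As a minimal sketch of that rule (not code from the linked post), the split can be done on the first empty line after normalizing line endings. The function name `split_headers_and_body` is made up for illustration, and the sketch assumes a simple, non-multipart message:

```php
<?php

// Hypothetical helper: split a raw message into its header block and its body.
function split_headers_and_body(string $raw): array
{
    // Mail often arrives with CRLF line endings; normalize them so the
    // empty separator line is always "\n\n".
    $raw = str_replace("\r\n", "\n", $raw);

    // Split once, on the first empty line only.
    $parts = explode("\n\n", $raw, 2);

    // A message without a body yields a single element; pad to two.
    return array_pad($parts, 2, '');
}

// Example with an inline message. In the piped script the input would be
// file_get_contents("php://stdin") instead.
$raw = "From: a@example.com\r\nSubject: Hi\r\n\r\nFirst line\r\nSecond line\r\n";
list($headerBlock, $body) = split_headers_and_body($raw);
echo $body;
```

Everything after the first empty line ends up in `$body`, including any later empty lines, because the split is limited to two pieces.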

This page might give you some other ideas (see script below):

http://thedrupalblog.com/configuring-server-parse-email-php-script

#!/usr/bin/php
<?php

// fetch data from stdin
$data = file_get_contents("php://stdin");

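// extract the body
// NOTE: in a properly formatted email, the first empty line separates the headers from the message body
// normalize CRLF line endings first so that the empty line is always "\n\n"
$data = str_replace("\r\n", "\n", $data);
// array_pad() keeps list() from failing when the message has no body
list($data, $body) = array_pad(explode("\n\n", $data, 2), 2, '');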

// explode on new line
$data = explode("\n", $data);

// define a variable map of known headers
$patterns = array(
  'Return-Path',
  'X-Original-To',
  'Delivered-To',
  'Received',
  'To',
  'Message-Id',
  'Date',
  'From',
  'Subject',
);

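// define a variable to hold parsed headers
$headers = array();

// remember the most recently seen header so that continuation lines can be attached to it
$last_match = null;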

// loop through data
foreach ($data as $data_line) {

  // for each line, assume a match does not exist yet
  $pattern_match_exists = false;

  // check for lines that start with white space
  // NOTE: if a line starts with a white space, it signifies a continuation of the previous header
  if ((substr($data_line,0,1)==' ' || substr($data_line,0,1)=="\t") && $last_match) {

    // append to last header
    $headers[$last_match][] = $data_line;
    continue;

  }

  // loop through patterns
  foreach ($patterns as $key => $pattern) {

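    // create preg regex (header names are case-insensitive, and the space after the colon is optional)
    $preg_pattern = '/^' . preg_quote($pattern, '/') . ':\s*(.*)$/i';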

    // execute preg
    preg_match($preg_pattern, $data_line, $matches);

    // check if preg matches exist
    if (count($matches)) {

      $headers[$pattern][] = $matches[1];
      $pattern_match_exists = true;
      $last_match = $pattern;

    }

  }

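  // check if a pattern did not match for this line
  if (!$pattern_match_exists) {
    $headers['UNMATCHED'][] = $data_line;
    // continuation lines of an unknown header belong with it, not with the last known header
    $last_match = 'UNMATCHED';
  }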

}

?>
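The script above leaves everything after the headers in `$body`. For a multipart message, that still contains every part plus the boundary lines. A rough sketch of picking out the text/plain part might look like the following. It is not part of the linked script, the function name `extract_plain_text` is hypothetical, and it makes simplifying assumptions: no nested multiparts, and no decoding of base64 or quoted-printable content.

```php
<?php

// Hypothetical helper: return the text/plain body of a raw message.
// Simplified on purpose: no nested multiparts, no transfer decoding.
function extract_plain_text(string $raw): string
{
    $raw = str_replace("\r\n", "\n", $raw);
    list($headers, $body) = array_pad(explode("\n\n", $raw, 2), 2, '');

    // Unfold continuation lines so a header spread over several lines reads as one.
    $headers = preg_replace('/\n[ \t]+/', ' ', $headers);

    // Not multipart: the whole body is the answer.
    $boundaryRegex = '/^Content-Type:\s*multipart\/.*?boundary="?([^";\n]+)"?/mi';
    if (!preg_match($boundaryRegex, $headers, $m)) {
        return $body;
    }

    // Parts are separated by "--boundary" lines; the final one is "--boundary--".
    foreach (explode("--" . $m[1], $body) as $part) {
        $pieces = explode("\n\n", ltrim($part, "\n"), 2);
        list($partHeaders, $partBody) = array_pad($pieces, 2, '');
        if (preg_match('/^Content-Type:\s*text\/plain/mi', $partHeaders)) {
            // Trailing newlines before the next boundary are trimmed.
            return rtrim($partBody, "\n");
        }
    }

    // Multipart, but no text/plain part was found.
    return '';
}

$raw = "From: a@example.com\r\n"
     . "Content-Type: multipart/alternative;\r\n"
     . " boundary=\"abc\"\r\n"
     . "\r\n"
     . "--abc\r\n"
     . "Content-Type: text/plain\r\n"
     . "\r\n"
     . "Hello\r\n"
     . "--abc\r\n"
     . "Content-Type: text/html\r\n"
     . "\r\n"
     . "<p>Hello</p>\r\n"
     . "--abc--\r\n";

echo extract_plain_text($raw); // prints "Hello"
```

Real-world mail has many more edge cases (nested parts, encodings, character sets), which is why a dedicated parser is usually the safer choice.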

EDIT

Here is a PHP extension called MailParse:

http://pecl.php.net/package/mailparse

Somebody has built a class around it called MimeMailParser:

http://code.google.com/p/php-mime-mail-parser/

And here is a blog entry discussing how to use it:

http://www.bucabay.com/web-development/a-php-mime-mail-parser-using-mailparse-extension/