I've got a webhook posting to a form on my web application and I need to parse out the email header addresses.
Here is the source text:
Thread-Topic: test subject
Thread-Index: AcwE4mK6Jj19Hgi0SV6yYKvj2/HJbw==
From: "Lastname, Firstname" <[email protected]>
To: <[email protected]>, [email protected], [email protected]
Cc: <[email protected]>, [email protected]
X-OriginalArrivalTime: 27 Apr 2011 13:52:46.0235 (UTC) FILETIME=[635226B0:01CC04E2]
I'm looking to pull out the following:
<[email protected]>, [email protected], [email protected]
I've been struggling with regex all day without any luck.
Contrary to some of the posts here, I have to agree with mmutz: you cannot reliably parse email addresses with a regex. See RFC 2822:
http://tools.ietf.org/html/rfc2822#section-3.4.1
3.4.1. Addr-spec specification
An addr-spec is a specific Internet identifier that contains a locally interpreted string followed by the at-sign character ("@", ASCII value 64) followed by an Internet domain.
The idea of "locally interpreted" means that only the receiving server is expected to be able to parse it.
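If I were going to try to solve this, I would find the contents of the "To" line, break it apart at the commas, and attempt to parse each segment with System.Net.Mail.MailAddress. When a segment fails to parse (for example, because the comma was inside a quoted string), extend it to the next comma and try again.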
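using System;
using System.Net.Mail;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Continuation lines of the verbatim string must stay at column 0 so that "^To" matches.
        string input = @"Thread-Topic: test subject
Thread-Index: AcwE4mK6Jj19Hgi0SV6yYKvj2/HJbw==
From: ""Lastname, Firstname"" <[email protected]>
To: <[email protected]>, ""Yes, this is valid""@[emails are hard to parse!], [email protected], [email protected]
Cc: <[email protected]>, [email protected]
X-OriginalArrivalTime: 27 Apr 2011 13:52:46.0235 (UTC) FILETIME=[635226B0:01CC04E2]";

        // Capture everything after "To:" on its own line (case-insensitive, multiline).
        Regex toline = new Regex(@"^To\s*:\s*(?<to>.*)$", RegexOptions.IgnoreCase | RegexOptions.Multiline);
        string to = toline.Match(input).Groups["to"].Value;

        int from = 0; // where to search for the next comma
        int pos = 0;  // start of the current candidate address
        while (from < to.Length)
        {
            int found = to.IndexOf(',', from);
            if (found < 0) found = to.Length; // no more commas: take the rest of the line
            from = found + 1;
            string test = to.Substring(pos, found - pos).Trim();
            if (test.Length == 0) { pos = found + 1; continue; } // MailAddress rejects empty strings
            try
            {
                MailAddress addy = new MailAddress(test);
                Console.WriteLine(addy.Address);
                pos = found + 1; // parsed OK: the next candidate starts after this comma
            }
            catch (FormatException)
            {
                // Not a valid address yet (e.g. a comma inside quotes): extend to the next comma and retry.
            }
        }
    }
}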
Output from the above program:
[email protected]
"Yes, this is valid"@[emails are hard to parse!]
[email protected]
[email protected]
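
Since the question's headers also carry a Cc line, the same try-and-extend idea could be wrapped in a small helper that takes the header name and returns the addresses as a list. This is only a rough sketch: `HeaderAddresses.Extract` is a made-up name, the sample addresses are placeholders, and it assumes each header sits on a single line (folded headers are not handled).

```csharp
using System;
using System.Collections.Generic;
using System.Net.Mail;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the answer above): collects the addresses of one header.
static class HeaderAddresses
{
    public static List<string> Extract(string headers, string headerName)
    {
        var result = new List<string>();
        // Match "<headerName>: value" on a single line; folded headers are not handled.
        var line = new Regex(@"^" + Regex.Escape(headerName) + @"[ \t]*:[ \t]*(?<v>[^\r\n]*)",
                             RegexOptions.IgnoreCase | RegexOptions.Multiline);
        string value = line.Match(headers).Groups["v"].Value; // empty if the header is missing

        int pos = 0;  // start of the current candidate
        int from = 0; // where to search for the next comma
        while (from < value.Length)
        {
            int found = value.IndexOf(',', from);
            if (found < 0) found = value.Length;
            from = found + 1;
            string candidate = value.Substring(pos, found - pos).Trim();
            if (candidate.Length == 0) { pos = found + 1; continue; } // skip empty segments
            try
            {
                result.Add(new MailAddress(candidate).Address);
                pos = found + 1;
            }
            catch (FormatException)
            {
                // Probably a comma inside a quoted string: keep extending the candidate.
            }
        }
        return result;
    }
}

static class Program
{
    static void Main()
    {
        // Placeholder headers for illustration only.
        string headers = "To: a@example.com, \"Doe, Jane\" <jane@example.com>\r\nCc: c@example.com";
        foreach (string address in HeaderAddresses.Extract(headers, "To"))
            Console.WriteLine(address);
    }
}
```

Returning a list instead of writing to the console lets the caller decide what to do with the addresses, and calling it once for "To" and once for "Cc" covers both headers without duplicating the loop.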