Regex to match . (periods marking end of sentences) but not Mr. (as in Mr. Hopkins)

Josh Crews · May 31, 2010 · Viewed 27.5k times

I'm trying to split a text file into sentences that end in periods, but titles like "Mr." (as in Mr. Hopkins) produce false matches on the period.

What regex matches "." but not the period in "Mr."?

As a bonus, I also use ! to find the end of sentences, so my current regex is /(!/./ and I'd love an answer that handles ! too.

Answer

Amarghosh · Jun 1, 2010

Use a negative lookbehind.

(?<!Mr|Mrs|Dr|Ms)\.

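This matches a period only if it is not immediately preceded by Mr, Mrs, Dr or Ms. PCRE (the engine behind PHP's preg functions) accepts alternatives of different lengths inside a lookbehind, as long as each alternative has a fixed length.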

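<?php
   $str = "This is Mr. Someone and Mrs. Somebody. They are here to meet Dr. SomeoneElse.";
   // Replace every sentence-ending period with a newline; periods after the listed titles are kept.
   $str = preg_replace("/(?<!Mr|Mrs|Dr|Ms)\\./", "\n", $str);
   echo($str);
?>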
//outputs:
This is Mr. Someone and Mrs. Somebody
 They are here to meet Dr. SomeoneElse
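
The question also asks about !, which the pattern above does not cover. A minimal sketch of one way to extend it, assuming sentences end in "." or "!" and are followed by whitespace: the `$terminator` pattern, the `$boundary` pattern and the helper name `split_sentences()` are illustrative assumptions, not part of the original answer.

```php
<?php
// Sketch only: patterns and helper name are assumptions for illustration.

// Matches "!" anywhere, or "." unless it directly follows Mr, Mrs, Dr or Ms.
$terminator = '/(?<!Mr|Mrs|Dr|Ms)\.|!/';

function split_sentences(string $text): array
{
    // Split on whitespace that follows "." or "!", except right after a title.
    // Each lookbehind alternative has its own fixed length, which PCRE allows.
    $boundary = '/(?<=[.!])(?<!Mr\.|Mrs\.|Dr\.|Ms\.)\s+/';
    return preg_split($boundary, trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

$str = "This is Mr. Someone and Mrs. Somebody. They are here to meet Dr. SomeoneElse! Welcome.";

// Count the sentence terminators found by the extended pattern.
echo preg_match_all($terminator, $str), "\n";

// Print one sentence per line, punctuation included.
echo implode("\n", split_sentences($str)), "\n";
```

Splitting on the whitespace after the terminator, rather than replacing the terminator itself, keeps the "." or "!" attached to each sentence. The title list is not exhaustive, so other abbreviations (for example "St." or "e.g.") would still be treated as sentence ends unless added to the lookbehind.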