Split sentence into words

Guno · Aug 8, 2013 · Viewed 14.6k times

For example, I have a sentence like this:

$text = "word, word w.d. word!..";

I need an array like this:

Array
(
    [0] => word
    [1] => word
    [2] => w.d
    [3] => word
)

I am very new to regular expressions.

Here is what I tried:

function divide_a_sentence_into_words($text){ 
    return preg_split('/(?<=[\s])(?<!f\s)\s+/ix', $text, -1, PREG_SPLIT_NO_EMPTY); 
}

This:

$text = "word word, w.d. word!..";
$split = preg_split("/[^\w]*([\s]+[^\w]*|$)/", $text, -1, PREG_SPLIT_NO_EMPTY);
print_r($split);

works, but I have a second question: I want to include a list of special cases in my regular expression. "w.d" is one such special case; for example, my list is "w.d", "mr.", "dr.".

If I take this text:

$text = "word, dr. word w.d. word!..";

I need this array:

Array (
  [0] => word
  [1] => dr.
  [2] => word
  [3] => w.d
  [4] => word 
)

Sorry for my bad English.

Answer

h2ooooooo · Aug 8, 2013

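Using preg_split with a regex of /[^\w]*([\s]+[^\w]*|$)/ should work fine. It splits on runs of whitespace, swallowing any non-word characters (punctuation) directly before or after the whitespace, and also strips punctuation at the end of the string. A dot inside a token, as in w.d, is not followed by whitespace, so it is kept: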

<?php
    $text = "word word w.d. word!..";
    $split = preg_split("/[^\w]*([\s]+[^\w]*|$)/", $text, -1, PREG_SPLIT_NO_EMPTY);
    print_r($split);
?>

Output:

Array
(
    [0] => word
    [1] => word
    [2] => w.d
    [3] => word
)
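
The regex above does not cover the follow-up question about a list of special cases such as "w.d", "mr." and "dr.". One possible approach, offered only as a sketch, is to match tokens with preg_match_all instead of splitting: try the listed abbreviations first and fall back to plain \w+ otherwise. The function name split_words_keeping_abbreviations and its parameters are hypothetical, not part of the original answer. Note one behavioural difference: a dotted token that is not in the list (say "a.b") comes back as two words here, whereas the preg_split version keeps it whole.

```php
<?php
// Hypothetical helper (not from the original answer): returns the words of
// $text, keeping every entry of $abbreviations intact as a single token.
function split_words_keeping_abbreviations(string $text, array $abbreviations): array
{
    // Without any abbreviations, plain word matching is enough.
    if (count($abbreviations) === 0) {
        preg_match_all('/\w+/', $text, $matches);
        return $matches[0];
    }

    // Try longer entries first so they win over shorter ones with the same prefix.
    usort($abbreviations, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape regex metacharacters (the dots) in each list entry.
    $quoted = array_map(function ($abbr) {
        return preg_quote($abbr, '/');
    }, $abbreviations);

    // \b assumes every abbreviation starts with a word character, so that
    // "dr." is not matched inside a longer word such as "hundr.".
    $pattern = '/\b(?:' . implode('|', $quoted) . ')|\w+/i';

    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

$text  = "word, dr. word w.d. word!..";
$words = split_words_keeping_abbreviations($text, ['w.d', 'mr.', 'dr.']);
print_r($words);
```

For the sample text this should yield word, dr., word, w.d, word. Because the list entries are matched literally, "dr." keeps its trailing dot while "w.d" does not, which mirrors the list exactly as the question wrote it.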