Smarter word-wrap in PHP for long words?

mowgli · Mar 22, 2012 · Viewed 31.9k times

I'm looking for a way to make word-wrap in PHP a bit smarter, so that it doesn't break before a long word and leave the preceding short words alone on a line.

Let's say I have this (the real text is always completely dynamic, this is just to show):

wordwrap('hello! heeeeeeeeeeeeeeereisaverylongword', 25, '<br />', true);

This outputs:

hello!
heeeeeeeeeeeeeeereisavery
longword

See, it leaves the small word alone on the first line. How can I get it to output something more like this:

hello! heeeeeeeeeeee
eeereisaverylongword

So it utilizes any available space on each line. I have tried several custom functions, but none have been effective (or they had some drawbacks).

Answer

cmbuckley · Mar 22, 2012

I've had a go at the custom function for this smart wordwrap:

function smart_wordwrap($string, $width = 75, $break = "\n") {
    // split on problem words over the line length
    $pattern = sprintf('/([^ ]{%d,})/', $width);
    $output = '';
    $words = preg_split($pattern, $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

    foreach ($words as $word) {
        if (false !== strpos($word, ' ')) {
            // normal behaviour, rebuild the string
            $output .= $word;
        } else {
            // work out how many characters would be on the current line
            $wrapped = explode($break, wordwrap($output, $width, $break));
            $count = $width - (strlen(end($wrapped)) % $width);

            // fill the current line and add a break
            $output .= substr($word, 0, $count) . $break;

            // wrap any remaining characters from the problem word
            $output .= wordwrap(substr($word, $count), $width, $break, true);
        }
    }

    // wrap the final output
    return wordwrap($output, $width, $break);
}

$string = 'hello! too long here too long here too heeeeeeeeeeeeeereisaverylongword but these words are shorterrrrrrrrrrrrrrrrrrrr';
echo smart_wordwrap($string, 11) . "\n";

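EDIT: Spotted a couple of caveats. One major caveat with this (and also with the native function) is the lack of multibyte support: both count bytes (the function above via strlen() and substr()), so multibyte characters such as UTF-8 accents are over-counted and can be cut in half.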
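The multibyte caveat could be worked around roughly as follows. This is only a sketch and not the function from the answer: it swaps the regex-plus-wordwrap approach for a simple greedy fill, the name mb_smart_wordwrap is made up for illustration, and it assumes PHP 7+, the mbstring extension, and valid UTF-8 input. It also collapses runs of spaces, which the original does not.

```php
<?php
// Hypothetical helper (not from the answer above): greedy, multibyte-aware wrap.
// Assumes the mbstring extension is loaded and $string is valid UTF-8.
function mb_smart_wordwrap(string $string, int $width = 75, string $break = "\n"): string
{
    if ($width < 1) {
        throw new InvalidArgumentException('Width must be at least 1.');
    }

    $lines = [];
    $line = '';

    foreach (preg_split('/ +/u', $string, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        while ($word !== '') {
            $lineLen = mb_strlen($line, 'UTF-8');
            // room left on the current line, allowing for a separating space
            $room = $width - $lineLen - ($lineLen > 0 ? 1 : 0);
            $wordLen = mb_strlen($word, 'UTF-8');

            if ($wordLen <= $room) {
                // the whole word (or the tail of a long word) fits
                $line .= ($lineLen > 0 ? ' ' : '') . $word;
                $word = '';
            } elseif ($wordLen > $width && $room > 0) {
                // over-long word: fill the remaining room, carry the rest over
                $line .= ($lineLen > 0 ? ' ' : '') . mb_substr($word, 0, $room, 'UTF-8');
                $word = mb_substr($word, $room, null, 'UTF-8');
                $lines[] = $line;
                $line = '';
            } else {
                // ordinary word that does not fit: start a new line and retry
                $lines[] = $line;
                $line = '';
            }
        }
    }

    if ($line !== '') {
        $lines[] = $line;
    }

    return implode($break, $lines);
}
```

It can be called the same way as the function above, for example echo mb_smart_wordwrap($string, 11) . "\n"; widths are counted in characters rather than bytes, so accented text should wrap at the same visual column as plain ASCII.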