PHP limit text string NOT including html tags?

Jamie Carter · Jul 1, 2010

Here's what's NOT working for me:

 $string = 'I have a dog and his name is <a href="">Jack</a> and I love him very much because he\'s my favorite dog in the whole wide world and nothing could make me not love him, I think.';
 $limited = substr($string, 0, 100).'...';
 echo $limited;

I want to limit the VISIBLE text to 100 characters, but substr() also counts the non-visible markup (<a href=""> and </a>), which takes up 15 of those 100 available characters.

Is there a way to limit the text so that the word "Jack" from the link would be included in the limit, but not <a href=""> or </a>?

Edit: I want to keep the link in the string, just not count its length towards the limit.
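
One way to see the mismatch, as a minimal sketch: compare the raw length of the string with its length after strip_tags(). The variable names below are only illustrative, and strlen() counts bytes, so this assumes single-byte text.

```php
<?php
$string = 'I have a dog and his name is <a href="">Jack</a> and I love him very much because he\'s my favorite dog in the whole wide world and nothing could make me not love him, I think.';

$total   = strlen($string);              // length including markup
$visible = strlen(strip_tags($string));  // length of the text a reader sees
$markup  = $total - $visible;            // length spent on the tags alone

echo "total=$total visible=$visible markup=$markup\n";
```

substr($string, 0, 100) cuts against the first number, while the limit is meant to apply to the second.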


Chris Harrison · Dec 13, 2011

A function to truncate words in HTML code:

//+ Jonas Raoni Soares Silva
function truncate($text, $length, $suffix = '&hellip;', $isHTML = true) {
    $i = 0; // number of characters taken up by markup so far
    // Unpaired (void) tags that never get a closing tag. This list is an
    // assumption: the snippet as posted uses $simpleTags without defining it.
    $simpleTags = array('br' => true, 'hr' => true, 'img' => true, 'input' => true, 'link' => true, 'meta' => true);
    $tags = array(); // stack of currently open tags
    if ($isHTML) {
        preg_match_all('/<[^>]+>([^<]*)/', $text, $m, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
        foreach ($m as $o) {
            // stop once the visible text before this tag reaches the limit
            if ($o[0][1] - $i >= $length)
                break;
            $t = substr(strtok($o[0][0], " \t\n\r\0\x0B>"), 1);
            // test if the tag is unpaired, then we mustn't save them
            if ($t[0] != '/' && (!isset($simpleTags[$t])))
                $tags[] = $t;
            elseif (end($tags) == substr($t, 1))
                array_pop($tags);
            // $o[1][1] - $o[0][1] is the length of the tag itself
            $i += $o[1][1] - $o[0][1];
        }
    }

    // output without closing tags
    $output = substr($text, 0, $length = min(strlen($text), $length + $i));
    // closing tags
    $output2 = (count($tags = array_reverse($tags)) ? '</' . implode('></', $tags) . '>' : '');

    // Find last space or HTML tag (solving problem with last space in HTML tag eg. <span class="new">)
    // end() takes its argument by reference, so go through intermediate variables
    $parts = preg_split('/<.*>| /', $output, -1, PREG_SPLIT_OFFSET_CAPTURE);
    $last = end($parts);
    $pos = (int)end($last);
    // Append closing tags to output
    $output .= $output2;

    // Get everything until last space
    $one = substr($output, 0, $pos);
    // Get the rest
    $two = substr($output, $pos, (strlen($output) - $pos));
    // Extract all tags from the last bit
    preg_match_all('/<(.*?)>/s', $two, $tags);
    // Add suffix if needed
    if (strlen($text) > $length) { $one .= $suffix; }
    // Re-attach tags
    $output = $one . implode($tags[0]);

    // added to remove unnecessary closure
    $output = str_replace('</!-->', '', $output);

    return $output;
}

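
As a rough alternative to the function above, here is a minimal sketch that walks the string token by token and counts only the text between tags. truncateVisible() and its parameters are hypothetical names introduced for illustration, not part of the original answer. It assumes reasonably well-formed HTML and single-byte text: strlen() and substr() count bytes, so UTF-8 content would need the mb_* equivalents.

```php
<?php
// Hypothetical helper: copies tags through untouched and counts only the
// text between them toward $limit.
function truncateVisible(string $html, int $limit, string $suffix = '...'): string
{
    // Split into text and tag tokens; the capture group keeps the tags.
    $tokens = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    $voidTags = array('br', 'hr', 'img', 'input', 'link', 'meta'); // never closed
    $open = array();     // stack of currently open tag names
    $out = '';
    $visible = 0;        // visible characters emitted so far
    $truncated = false;

    foreach ($tokens as $token) {
        if (preg_match('/^<[^>]+>\z/', $token)) {
            // Markup: track open/close state, then copy it without counting.
            if (preg_match('#^<(/?)([a-z][a-z0-9]*)#i', $token, $m)) {
                $name = strtolower($m[2]);
                if ($m[1] === '/') {
                    if (end($open) === $name) {
                        array_pop($open);
                    }
                } elseif (!in_array($name, $voidTags, true) && substr($token, -2) !== '/>') {
                    $open[] = $name;
                }
            }
            $out .= $token;
            continue;
        }

        // Text: emit as much as the remaining budget allows.
        $remaining = $limit - $visible;
        if (strlen($token) > $remaining) {
            $out .= substr($token, 0, $remaining);
            $truncated = true;
            break;
        }
        $out .= $token;
        $visible += strlen($token);
    }

    if ($truncated) {
        $out .= $suffix;
    }
    // Close whatever is still open, innermost first.
    while ($open) {
        $out .= '</' . array_pop($open) . '>';
    }
    return $out;
}

$string = 'I have a dog and his name is <a href="">Jack</a> and I love him very much because he\'s my favorite dog in the whole wide world and nothing could make me not love him, I think.';
echo truncateVisible($string, 100), "\n";
```

Unlike the function above, this sketch cuts at an exact character count rather than backing up to the last space, so it can end mid-word.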