remove html tags

Thomas · Dec 11, 2011 · Viewed 16.4k times

Currently I use strip_tags to remove all HTML tags from the strings I process. However, I noticed lately that it joins words that were in separate, adjacent tags, e.g.

$str = "<li>Hello</li><li>world</li>";
$result = strip_tags($str);
echo $result;
(prints Helloworld)

How can you get around this?

Answer

kontur · Dec 11, 2011

This replaces every HTML tag (in fact anything of the form < ABC >, without checking whether it really is HTML) with a space, then collapses any resulting double spaces into single spaces and trims leading and trailing whitespace.

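// Replace every tag-like <...> sequence with a single space.
$str = preg_replace("/<.*?>/", " ", $str);
// Collapse double spaces into one, then trim both ends.
$str = trim(str_replace("  ", " ", $str));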
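
One thing to keep in mind with this approach: a single str_replace pass turns three consecutive spaces into two, so several tags in a row (for example `</li></ul><p>`) can still leave a doubled space. A minimal sketch of a variant that collapses whitespace runs of any length is below. The function name `strip_tags_spaced` is made up for illustration and is not part of PHP or of the answer above.

```php
<?php
// Hypothetical helper (not from the original answer): removes tags but
// leaves a space where each tag was, so adjacent words stay separated.
function strip_tags_spaced(string $html): string
{
    // Replace every <...> sequence with a space. The "s" modifier lets "."
    // match newlines, so tags that span several lines are caught too.
    $text = preg_replace('/<.*?>/s', ' ', $html);

    // Collapse any run of whitespace (spaces, tabs, newlines) into one space.
    $text = preg_replace('/\s+/', ' ', $text);

    // Drop the leading and trailing space left by the outermost tags.
    return trim($text);
}

echo strip_tags_spaced("<li>Hello</li><li>world</li>"), "\n"; // Hello world
echo strip_tags_spaced("<ul><li>a</li></ul><p>b</p>"), "\n";   // a b
```

Like the original, this is a regex and not an HTML parser: it also splits words around inline tags (`<b>Hel</b>lo` becomes `Hel lo`), and a `>` inside an attribute value would end the match early. For input where that matters, a DOM-based approach is probably the safer choice.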