How to remove html special chars?

Prashant picture Prashant · Mar 18, 2009 · Viewed 219.5k times · Source

I am creating a RSS feed file for my application in which I want to remove HTML tags, which is done by strip_tags. But strip_tags is not removing HTML special code chars:

  & © 

etc.

Please tell me any function which I can use to remove these special code chars from my string.

Answer

schnaader picture schnaader · Mar 18, 2009

Either decode them using html_entity_decode or remove them using preg_replace:

$Content = preg_replace("/&#?[a-z0-9]+;/i","",$Content); 

(From here)

EDIT: Alternative according to Jacco's comment

might be nice to replace the '+' with {2,8} or something. This will limit the chance of replacing entire sentences when an unencoded '&' is present.

$Content = preg_replace("/&#?[a-z0-9]{2,8};/i","",$Content);