I've got a bunch of HTML data that I'm writing to a PDF file using PHP. In the PDF, I want all of the HTML to be stripped and cleaned up. So for instance:
<ul>
<li>First list item</li>
<li>Second list item which is quite a bit longer</li>
<li>List item with apostrophe 's 's</li>
</ul>
Should become:
First list item
Second list item which is quite a bit longer
List item with apostrophe 's 's
However, if I simply use strip_tags(), I get something like this:
First list item

Second list item which is quite a bit
longer

List item with apostrophe ’s ’s
Also note the indentation of the output.
Any tips on how to properly clean up the HTML into nice, clean strings without messy whitespace and odd characters?
Thanks :)
The characters seem to be HTML entities. Try:
html_entity_decode( strip_tags( $my_html_code ) );
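That call handles the entities but not the leftover indentation and blank lines from the question. A minimal sketch of one way to cover both follows. The function name html_to_plain_text is hypothetical (it is not part of PHP), the UTF-8 charset is an assumption about what your PDF library expects, and the arrow function needs PHP 7.4 or later:

```php
<?php
// Hypothetical helper, not a built-in: strips tags, decodes entities,
// then removes the indentation and blank lines left behind by the markup.
function html_to_plain_text(string $html): string
{
    // Strip tags first, then decode entities such as &rsquo; or &#8217;.
    // Assumption: the output should be UTF-8.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Split on any common line ending and trim each line.
    $lines = array_map('trim', preg_split('/\r\n|\r|\n/', $text));

    // Drop the lines that are empty after trimming.
    $lines = array_filter($lines, static fn (string $line): bool => $line !== '');

    return implode("\n", $lines);
}

// Example usage with markup similar to the question's list.
$my_html_code = "<ul>\n  <li>First list item</li>\n  <li>Second list item</li>\n</ul>";
echo html_to_plain_text($my_html_code), "\n";
```

Note that this only tidies whitespace line by line. If the source HTML wraps a single list item across two lines, that line break is kept, so you may need to normalize the markup first if that matters for your PDF.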