Can SimpleXML be used to rifle through HTML?

chris · Jul 9, 2011 · Viewed 12.3k times

I would like to grab data from a table without using regular expressions. I've enjoyed using SimpleXML for parsing RSS feeds and would like to know whether it can also be used to grab a table from an HTML page.

E.g., fetch the page with cURL or simply file_get_contents(), then use SimpleXML to extract the contents?

Answer

phihag · Jul 9, 2011

You can use the loadHTML method of DOMDocument (from the DOM extension), and then import the resulting document into SimpleXML via simplexml_import_dom:

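// Fetch the page; file_get_contents() returns false on failure.
$html = file_get_contents('http://example.com/');
// Real-world HTML is rarely well-formed, so have libxml collect parse warnings instead of emitting them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
// Wrap the parsed DOM in a SimpleXMLElement.
$sxml = simplexml_import_dom($doc);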
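
To get from $sxml to the table itself, XPath is one option. What follows is only a minimal sketch: the inline markup and the table id "prices" are made up for illustration, and in practice $html would come from file_get_contents() or cURL as above.

```php
<?php
// Hypothetical sample markup standing in for a fetched page.
$html = <<<HTML
<html><body>
<table id="prices">
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Apple</td><td>1.20</td></tr>
  <tr><td>Pear</td><td>0.80</td></tr>
</table>
</body></html>
HTML;

// Collect libxml parse warnings instead of emitting them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$sxml = simplexml_import_dom($doc);

// Select every row of the table with the (assumed) id "prices".
$rows = [];
foreach ($sxml->xpath('//table[@id="prices"]//tr') as $tr) {
    $cells = [];
    // Header cells and data cells alike, relative to the current row.
    foreach ($tr->xpath('th|td') as $cell) {
        $cells[] = trim((string) $cell);
    }
    $rows[] = $cells;
}

print_r($rows);
```

Matching on //tr rather than table/tr should keep the query working whether or not the markup wraps its rows in tbody.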