Best way to process large XML in PHP

Petruza · Jul 22, 2009 · Viewed 50.1k times

I have to parse large XML files in PHP. One of them is 6.5 MB, and they could be even bigger. From what I've read, the SimpleXML extension loads the entire file into an object, which may not be very efficient. In your experience, what would be the best way?

Answer

Eric Petroelje · Jul 22, 2009

For a large file, you'll want to use a SAX parser rather than a DOM parser.

A DOM parser reads the whole file and loads it into an object tree in memory. A SAX parser reads the file sequentially and calls your user-defined callback functions to handle the data (start tags, end tags, character data, etc.).

With a SAX parser you'll need to maintain state yourself (e.g. which tag you are currently in), which makes it a bit more complicated, but for a large file it will be much more efficient memory-wise.
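
As a rough illustration of the callback and state-tracking approach described above, here is a minimal sketch using PHP's built-in XML Parser extension (`xml_parser_create`, `xml_set_element_handler`, `xml_set_character_data_handler`, `xml_parse`). The helper name `collectElementText`, the chunk size, and the idea of collecting the text of one element name are assumptions made for the example, not something from the question. The file is fed to the parser in small chunks, so only one chunk plus the collected results is held in memory at a time.

```php
<?php
// Sketch only: collectElementText() is a hypothetical helper for this example.
// It streams a file through the SAX-style parser and returns the text of
// every element named $wanted.
function collectElementText(string $path, string $wanted, int $chunkSize = 8192): array
{
    $stack = [];    // our own state: names of the elements currently open
    $buffer = '';   // text gathered for the current $wanted element
    $results = [];

    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Start tag: push the name and reset the buffer if it is the one we want.
        function ($p, $name, $attrs) use (&$stack, &$buffer, $wanted) {
            $stack[] = $name;
            if ($name === $wanted) {
                $buffer = '';
            }
        },
        // End tag: store the gathered text, then pop the name.
        function ($p, $name) use (&$stack, &$buffer, &$results, $wanted) {
            if ($name === $wanted) {
                $results[] = trim($buffer);
            }
            array_pop($stack);
        }
    );

    xml_set_character_data_handler(
        $parser,
        // May fire several times for one text node, so append rather than assign.
        function ($p, $data) use (&$stack, &$buffer, $wanted) {
            if (end($stack) === $wanted) {
                $buffer .= $data;
            }
        }
    );

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (!feof($handle)) {
            $chunk = fread($handle, $chunkSize);
            if ($chunk === false) {
                throw new RuntimeException("Read error on $path");
            }
            // The third argument tells the parser whether this is the last chunk.
            if (xml_parse($parser, $chunk, feof($handle)) !== 1) {
                throw new RuntimeException(sprintf(
                    'XML error: %s at line %d',
                    xml_error_string(xml_get_error_code($parser)),
                    xml_get_current_line_number($parser)
                ));
            }
        }
    } finally {
        fclose($handle);
    }
    return $results;
}
```

Calling `collectElementText('large.xml', 'title')` (both arguments are placeholders) would return an array with the text of each `<title>` element. The `$stack` array is the "state you maintain yourself": the parser only reports events, so the handlers have to remember which element they are inside.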