ntan

Reputation: 2205

PHP: parse a large XML file with paging

I have a large XML file with 22000 records that I have to import into my DB.

I am looking for a way to parse the XML with paging, meaning:

parse.php?start=0   // this script gets records 0-500 of the file
parse.php?start=500 // this script gets records 500-1000 of the file

This way I can bypass memory problems.

My problem is how to point at record 500 when loading the XML file.

My code is simple

$data=simplexml_load_file($xmlFile);

foreach ($data->product as $product) {
   foreach($product->children() as $section) {
       addToDB($section);
   }
}

The code above works fine for 1000-2000 records, but I want to modify it as described above so it works with large XML files.

Upvotes: 0

Views: 378

Answers (3)

Nikolay Gechev

Reputation: 1

A very fast way is:

$data = preg_split('/(<|>)/m', $xmlFile);

After that, only one loop is needed.

Upvotes: 0

Stefan Gehrig

Reputation: 83622

SimpleXML is a DOM parser, which means that it must load the entire document into memory to build an in-memory representation of the whole XML dataset. Chunking the data does not work with this type of parser.

To load XML datasets that large, you must switch to a so-called *pull parser*, such as XMLReader or the very low-level XML Parser extension. Pull parsers work by traversing the XML document element by element and allow you, the developer, to react to the currently parsed element. That reduces the memory footprint because only small fragments of the data have to be loaded into memory at any one time. Using pull parsers is a little uncommon and not as intuitive as the familiar DOM parsers (DOM and SimpleXML).
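A minimal sketch of that pull-parsing approach with XMLReader; the inline sample data, the `$imported` counter, and the `addToDB()` stub are stand-ins for the question's real 22000-record file and import function:

```php
<?php
// Tiny inline sample standing in for the real products file.
$xml = <<<XML
<products>
  <product><name>A</name><price>1</price></product>
  <product><name>B</name><price>2</price></product>
</products>
XML;

$imported = 0;
function addToDB($section) {   // placeholder: insert $section into the DB here
    global $imported;
    $imported++;
}

$reader = new XMLReader();
$reader->XML($xml);            // for a real file use $reader->open($xmlFile)

// advance the cursor to the first <product> element
while ($reader->read() && $reader->name !== 'product');

while ($reader->name === 'product') {
    // materialise only this one record as SimpleXML, not the whole document
    $product = simplexml_load_string($reader->readOuterXml());
    foreach ($product->children() as $section) {
        addToDB($section);
    }
    $reader->next('product');  // jump over the subtree to the next sibling
}
$reader->close();

echo $imported, "\n";          // 4 child sections across 2 products
```

Only one `<product>` record is ever held in memory at a time, so memory use stays flat regardless of how many records the file contains.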

Upvotes: 1

cweiske

Reputation: 31088

That's not possible.

You should use XMLReader to import large files as described in my blog post.
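For what it's worth, XMLReader can also emulate the paging the question asks for by skipping ahead to a start index and stopping after one page of records. A hedged sketch; `$start`, `$limit`, and the inline sample are illustrative assumptions, not part of any answer above:

```php
<?php
// Inline sample of 10 records standing in for the real file.
$xml = '<products>'
     . str_repeat('<product><sku>x</sku></product>', 10)
     . '</products>';

$start = 4;   // e.g. taken from parse.php?start=4
$limit = 3;   // records per page

$handled = 0;
$index   = 0;
$reader  = new XMLReader();
$reader->XML($xml);            // for a real file: $reader->open($xmlFile)

// advance to the first <product> element
while ($reader->read() && $reader->name !== 'product');

while ($reader->name === 'product') {
    if ($index >= $start && $index < $start + $limit) {
        $product = simplexml_load_string($reader->readOuterXml());
        // ... pass each child to the question's addToDB() here ...
        $handled++;
    } elseif ($index >= $start + $limit) {
        break;                 // page done; stop reading the rest of the file
    }
    $index++;
    $reader->next('product');  // skip to the next <product> sibling
}
$reader->close();

echo $handled, "\n";           // 3
```

Note that each page still has to skim past the preceding records, but skimming with `next()` is cheap compared to building the full DOM.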

Upvotes: 0
