Reputation: 2205
I have a large XML file with 22000 records that I have to import into my DB.
I am looking for a way to parse the XML with paging, meaning
parse.php?start=0 //this script gets the first 0-500 records of the file
parse.php?start=500 //this script gets records 500-1000 of the file
This way I can bypass memory problems.
My problem is how to point at record 500 when loading the XML file.
My code is simple:
$data = simplexml_load_file($xmlFile);
foreach ($data->product as $product) {
    foreach ($product->children() as $section) {
        addToDB($section);
    }
}
The code above works fine for 1000-2000 records, but I want to modify it as described above so that it works with large XML files.
Upvotes: 0
Views: 378
Reputation: 1
A very high-performance way is
$data = preg_split('/(<|>)/m', file_get_contents($xmlFile));
After that, only a single loop over the tokens is needed.
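A minimal sketch of that single loop, assuming $data holds the tokens from the split above (for well-formed markup, odd indexes hold tag markup and even indexes hold the text between tags):
$currentTag = '';
foreach ($data as $i => $token) {
    if ($i % 2 === 1) {
        // odd index: tag markup such as "product", "name" or "/name"
        $currentTag = $token;
    } elseif (trim($token) !== '' && $currentTag !== '' && $currentTag[0] !== '/') {
        // even index: text content following an opening tag
        echo $currentTag . ' => ' . trim($token) . "\n";
    }
}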
Upvotes: 0
Reputation: 83622
SimpleXML is a DOM parser, which means it must load the whole document into memory to build an in-memory representation of the whole XML dataset. Chunking the data does not work with this type of parser.
To load XML datasets that large you must switch to so-called pull parsers, such as XMLReader or the very low-level XML Parser extension. Pull parsers work by traversing the XML document element by element and allow you, the developer, to react to the currently parsed element. That reduces the memory footprint because only small fragments of the data have to be loaded into memory at any one time. Using pull parsers is a little uncommon and not as intuitive as the familiar DOM parsers (DOM and SimpleXML).
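A minimal sketch with XMLReader, assuming the same <product> element structure and addToDB() helper from the question:
$reader = new XMLReader();
$reader->open($xmlFile);

// skip ahead to the first <product> element
while ($reader->read() && $reader->name !== 'product');

while ($reader->name === 'product') {
    // expand only the current <product> into a SimpleXMLElement,
    // so just one record sits in memory at a time
    $product = simplexml_import_dom($reader->expand(new DOMDocument()));
    foreach ($product->children() as $section) {
        addToDB($section);
    }
    // jump straight to the next <product> sibling without loading anything in between
    $reader->next('product');
}

$reader->close();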
Upvotes: 1
Reputation: 31088
That's not possible.
You should use XMLReader to import large files as described in my blog post.
Upvotes: 0