Armin Sam

Reputation: 933

Count all elements of a certain name in an XML file using PHP

Given the XML below:

<Items>
    <Item>...</Item>
    <Item>...</Item>
    <Item>...</Item>
    <Item>...</Item>
</Items>

I am writing a function to return the count of all <Item> elements (4 in this case). The actual XML file is huge, and I don't want to load the entire thing into memory in order to parse it.

Using the command line, I managed to get what I need with the following line:

grep "<Item>" my_file.xml -o | wc -l

Is there an equivalent solution in PHP that I can use to get the same result?

Upvotes: 2

Views: 199

Answers (1)

Ruslan Osmanov

Reputation: 21492

It is easily done with XPath:

$doc = new DOMDocument();
$doc->load('my_file.xml', LIBXML_PARSEHUGE);

$xp = new DOMXPath($doc);
$count = $xp->evaluate('count(//Item)');

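The XPath expression returns the number of Item elements in the document. XPath numbers are floating-point, so evaluate() returns the count as a float (4.0 for the sample above).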
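To try the expression without a file on disk, here is a minimal sketch that runs the same query against the sample from the question. The inline string and the use of loadXML() instead of load() are assumptions made only so the snippet is self-contained.

```php
<?php
// Sample document from the question, inlined for illustration only.
$xml = '<Items><Item>a</Item><Item>b</Item><Item>c</Item><Item>d</Item></Items>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$xp = new DOMXPath($doc);
// count() yields an XPath number, which PHP exposes as a float; cast it to int.
$count = (int) $xp->evaluate('count(//Item)');

var_dump($count); // int(4)
```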

The LIBXML_PARSEHUGE option only relaxes the parser's internal limits on document depth, entity recursion, and text node size. It does not reduce memory use: the DOM parser still loads the entire document into memory.

For really huge files, use a SAX parser, which operates on each piece of XML sequentially (and thus loads only a small portion of the document into memory):

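$counter = 0;

$xml_parser = xml_parser_create();
// Case folding is on by default, so element names arrive upper-cased ('ITEM').
xml_set_element_handler($xml_parser, function ($parser, $name) use (&$counter) {
  if ($name === 'ITEM') {
    $counter++;
  }
}, null);

if (!($fp = fopen('my_file.xml', 'r'))) {
  die('Could not open XML input');
}

// Feed the file to the parser in 4 KB chunks.
while (!feof($fp)) {
  $data = fread($fp, 4096);
  if ($data === false) {
    die('Could not read XML input');
  }
  // The third argument tells the parser whether this is the last chunk.
  if (!xml_parse($xml_parser, $data, feof($fp))) {
    die(sprintf("XML error: %s at line %d",
      xml_error_string(xml_get_error_code($xml_parser)),
      xml_get_current_line_number($xml_parser)));
  }
}
fclose($fp);
xml_parser_free($xml_parser);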
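Since the question asks for a function, here is one possible sketch that uses PHP's XMLReader, a pull parser that also streams the file node by node. This is a different technique from the SAX handler above, not a rewrite of it, and the helper name countElements and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: counts elements named $name in the file at $path.
function countElements(string $path, string $name): int
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Could not open XML input: $path");
    }

    $count = 0;
    // read() advances to the next node; only the current node is kept in memory.
    while ($reader->read()) {
        // Count opening (or self-closing) element nodes with a matching name.
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $name) {
            $count++;
        }
    }

    $reader->close();
    return $count;
}
```

It would be called as countElements('my_file.xml', 'Item'). Unlike the SAX version, there is no case folding here, so the name has to match the case used in the document.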

Upvotes: 1
