beterman

Reputation: 99

Fastest way to parse an XML file with libxml2?

Hi, is there any "faster" way to parse an XML file with libxml2? Right now I do it with the following C++ code:

#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

void parse_element_names(xmlNode * a_node, int *calls)
{
    xmlNode *cur_node = NULL;

    for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
      (*calls)++;
      /* BAD_CAST only casts the literal to xmlChar *; xmlCharStrdup() would
         allocate (and leak) a fresh copy on every single comparison */
      if(xmlStrEqual(BAD_CAST "to",cur_node->name)){
        //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
        //do something with the content
        if(cur_node->children) /* an empty element has no children */
          parse_element_names(cur_node->children->children,calls);
        }
      else if(xmlStrEqual(BAD_CAST "from",cur_node->name)) {
        //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
        //do something with the content
        if(cur_node->children)
          parse_element_names(cur_node->children->children,calls);
        }
      else if(xmlStrEqual(BAD_CAST "note",cur_node->name)) {
        //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
        //do something with the content
        if(cur_node->children)
          parse_element_names(cur_node->children->children,calls);
        }
        .
        .
        .
        //about 100 more node names coming
      else{
        parse_element_names(cur_node->children,calls);
      }
    }

}
int main(int argc, char **argv)
{ 

    xmlDoc *doc = NULL;
    xmlNode *root_element = NULL;

    if (argc != 2)
        return(1);

    /*parse the file and get the DOM */
    doc = xmlReadFile(argv[1], NULL, XML_PARSE_NOBLANKS);

    if (doc == NULL) {
        printf("error: could not parse file %s\n", argv[1]);
        return 1; /* nothing to walk or free if parsing failed */
    }
    int calls = 0;
    /*Get the root element node */
    root_element = xmlDocGetRootElement(doc);
    parse_element_names(root_element,&calls);

    /*free the document */
    xmlFreeDoc(doc);

    xmlCleanupParser();

    return 0;
}

Is this really the fastest way? Or is there a better/faster solution you can advise?

Thank you

Upvotes: 0

Views: 1588

Answers (1)

nwellnhof

Reputation: 33618

xmlReadFile et al. are built on top of libxml2's SAX parser interface (actually, the SAX2 interface), so it's generally faster to register your own SAX callbacks if you don't need the resulting xmlDoc.

If you have to distinguish between many different element names like in your example, the fastest approach is usually to create separate functions for every type of node and use a hash table to look up these functions.
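
A rough sketch of that dispatch idea in plain C, with no libxml2 dependency so it stays self-contained. Everything in it is an assumption made for illustration: the node_handler signature, the Counters struct, the fixed table size and the FNV-1a hash. With libxml2 you would pass the element name (cast from xmlChar *) to the lookup, or use libxml2's own xmlHash API from libxml/hash.h instead of a hand-rolled table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical handler signature: receives the node's text content. */
typedef void (*node_handler)(const char *content, void *user);

#define TABLE_SIZE 256   /* power of two, well above ~100 element names */

typedef struct {
    const char *name;    /* NULL marks an empty slot */
    node_handler fn;
} Slot;

static Slot table[TABLE_SIZE];

/* 32-bit FNV-1a string hash. */
static uint32_t hash_name(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Inserts or overwrites a handler using linear probing.
   Returns 0 on success, -1 if the table is full. */
static int register_handler(const char *name, node_handler fn)
{
    uint32_t i = hash_name(name);
    assert(fn != NULL);
    for (int probes = 0; probes < TABLE_SIZE; probes++, i++) {
        Slot *s = &table[i & (TABLE_SIZE - 1)];
        if (s->name == NULL || strcmp(s->name, name) == 0) {
            s->name = name;
            s->fn = fn;
            return 0;
        }
    }
    return -1;
}

/* Returns the handler registered for name, or NULL if there is none. */
static node_handler lookup_handler(const char *name)
{
    uint32_t i = hash_name(name);
    for (int probes = 0; probes < TABLE_SIZE; probes++, i++) {
        const Slot *s = &table[i & (TABLE_SIZE - 1)];
        if (s->name == NULL)
            return NULL;
        if (strcmp(s->name, name) == 0)
            return s->fn;
    }
    return NULL;
}

/* Two example handlers; real ones would process the content. */
typedef struct { int to_count; int from_count; } Counters;

static void handle_to(const char *content, void *user)
{
    (void)content;
    ((Counters *)user)->to_count++;
}

static void handle_from(const char *content, void *user)
{
    (void)content;
    ((Counters *)user)->from_count++;
}
```

You would register all handlers once at startup, then do a single lookup per element instead of walking a chain of up to 100 string comparisons. A NULL result plays the role of the final else branch in the question's code.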

Upvotes: 2
