tech_learner

Reputation: 725

Read a line from xml file using C++

My XML File has:

< Package > xmlMetadata < /Package >

I am searching for a tag in this file, and the text between its opening and closing tags has to be printed on the console, i.e. in this case I want xmlMetadata to be printed. The search should then continue through the file and print again each time it encounters another < Package > tag.

Here is my code but it is printing the contents of the whole file:

{
    string line="< Package >";
    ifstream myfile (xmlFileName); //xmlFileName is the xml file in which the search is to be done
    if (myfile.is_open())
    {
    while ( myfile.good() )
    {
      getline (myfile,line);
      std::cout<<line<< endl;
    }
    myfile.close();
    }
    else cout << "Unable to open file"; 
}

Here is my whole XML file:

< ? xml version="1.0" ? >
< fileStructure >
< Main_Package >
   File_Navigate
< /Main_Package >
< Dependency_Details >

< Dependency >
   < Package >
      xmlMetadata
   < /Package >
   < Header >
      xmlMetadata.h
   < /Header >
   < Header_path >
      C:\Dependency\xmlMetadata\xmlMetadata.h
   < /Header_path >
   < Implementation >
      xmlMetadata.cpp
   < /Implementation >
   < Implementation_path >
      C:\Dependency\xmlMetadata\xmlMetadata.cpp
   < /Implementation_path >
< /Dependency >

< Dependency >
   < Package >
      xmlMetadata1
   < /Package >
   < Header >
      xmlMetadata1.h
   < /Header >
   < Header_path >
      C:\Dependency\xmlMetadata\xmlMetadata1.h
   < /Header_path >
   < Implementation >
      xmlMetadata1.cpp
   < /Implementation >
   < Implementation_path >
      C:\Dependency\xmlMetadata\xmlMetadata1.cpp
   < /Implementation_path >
< /Dependency >

< /Dependency_Details >
< /fileStructure >

Upvotes: 2

Views: 37835

Answers (3)

karlphillip

Reputation: 93410

This is not the way you should parse an XML file, but since you don't want to use a parser library this code might get you started.

File: demo.xml

<?xml version="1.0"?>
<fileStructure>
<Main_Package>
   File_Navigate
</Main_Package>
<Dependency_Details>

<Dependency>
   <Package>
      xmlMetadata
   </Package>
   <Header>
      xmlMetadata.h
   </Header>
   <Header_path>
      C:\Dependency\xmlMetadata\xmlMetadata.h
   </Header_path>
   <Implementation>
      xmlMetadata.cpp
   </Implementation>
   <Implementation_path>
      C:\Dependency\xmlMetadata\xmlMetadata.cpp
   </Implementation_path>
</Dependency>

<Dependency>
   <Package>
      xmlMetadata1
   </Package>
   <Header>
      xmlMetadata1.h
   </Header>
   <Header_path>
      C:\Dependency\xmlMetadata\xmlMetadata1.h
   </Header_path>
   <Implementation>
      xmlMetadata1.cpp
   </Implementation>
   <Implementation_path>
      C:\Dependency\xmlMetadata\xmlMetadata1.cpp
   </Implementation_path>
</Dependency>

</Dependency_Details>
</fileStructure>

The basic idea is to read the file line by line, strip the leading whitespace from each line, store the stripped string in tmp, and compare tmp against the tags you are looking for. Once the opening tag is found, keep printing the following lines until the closing tag is found.

File: parse.cpp

#include <iostream>
#include <string>
#include <fstream>

using namespace std;

int main()
{
    string line;
    ifstream in("demo.xml");

    bool begin_tag = false;
    while (getline(in,line))
    {
        // strip whitespace (spaces and tabs) from the beginning of the line
        const string::size_type first = line.find_first_not_of(" \t");
        const string tmp = (first == string::npos) ? "" : line.substr(first);

        //cout << "-->" << tmp << "<--" << endl;

        if (tmp == "<Package>")
        {
            //cout << "Found <Package>" << endl;
            begin_tag = true;
            continue;
        }
        else if (tmp == "</Package>")
        {
            begin_tag = false;
            //cout << "Found </Package>" << endl;
        }

        if (begin_tag)
        {
            cout << tmp << endl;
        }
    }
}

Outputs:

xmlMetadata
xmlMetadata1
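
The same line-by-line idea can be packaged as a reusable function. The following is only a sketch under the same assumption as above, namely that the opening tag, the text and the closing tag each sit on their own line. The names trim, collect_tag_lines and collect_tag_lines_from_text are made up for this example. Unlike the loop above, it also strips trailing whitespace and a trailing carriage return (useful for files with Windows line endings) and returns the values instead of printing them:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Remove spaces, tabs and carriage returns from both ends of s.
std::string trim(const std::string& s)
{
    const char* ws = " \t\r";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: returns every non-empty line found between a line
// that holds only <tag> and a line that holds only </tag>.
std::vector<std::string> collect_tag_lines(std::istream& in, const std::string& tag)
{
    assert(!tag.empty()); // precondition: a tag name is required
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";

    std::vector<std::string> values;
    bool inside = false;
    std::string line;
    while (std::getline(in, line))
    {
        const std::string text = trim(line);
        if (text == open)
            inside = true;
        else if (text == close)
            inside = false;
        else if (inside && !text.empty())
            values.push_back(text);
    }
    return values;
}

// Convenience wrapper for XML that is already held in memory.
std::vector<std::string> collect_tag_lines_from_text(const std::string& xml,
                                                     const std::string& tag)
{
    std::istringstream in(xml);
    return collect_tag_lines(in, tag);
}
```

To use it on demo.xml, open the file with std::ifstream, pass the stream to collect_tag_lines together with "Package", and print each element of the returned vector.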

Upvotes: 4

Martin Beckett

Reputation: 96109

getline doesn't search for a line; it simply reads each line into the variable "line". You then have to search in that "line" for the text you want.

   size_t found = line.find("Package");
   if (found != std::string::npos) {
       cout << line << endl; // getline drops the newline, so add it back
   }

BUT this is a bad way to handle XML: there is nothing stopping the XML writer from breaking the tag onto multiple lines. Unless this is a one-off and you create the file yourself, you really should use a general XML parser to read the file and give you a list of tags.
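
If it really is a one-off, one way to be less dependent on where the line breaks fall is to search the whole file contents rather than individual lines. The following is only a sketch, not a parser: find_tag_text is a made-up name, and it assumes tags are written exactly as <Package> and </Package> (no attributes, no whitespace inside the tag, no nesting). It still fails if the tag itself is split across lines, which is exactly why a real parser is the better choice:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the trimmed text between each <tag> ... </tag>
// pair in xml. Works whether the pair sits on one line or spans several.
std::vector<std::string> find_tag_text(const std::string& xml, const std::string& tag)
{
    assert(!tag.empty()); // precondition: a tag name is required
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    const char* ws = " \t\r\n";

    std::vector<std::string> values;
    std::string::size_type pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos)
    {
        const std::string::size_type start = pos + open.size();
        const std::string::size_type end = xml.find(close, start);
        if (end == std::string::npos)
            break; // unmatched opening tag: stop searching

        // Trim surrounding whitespace, including newlines.
        const std::string::size_type first = xml.find_first_not_of(ws, start);
        if (first != std::string::npos && first < end)
        {
            const std::string::size_type last = xml.find_last_not_of(ws, end - 1);
            values.push_back(xml.substr(first, last - first + 1));
        }
        pos = end + close.size(); // continue after this closing tag
    }
    return values;
}
```

The file contents can be read into a std::string first, for example by constructing the string from a pair of std::istreambuf_iterator<char> over an std::ifstream, and then passed to find_tag_text together with "Package".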

There are a bunch of very easy to use XML parsers, such as TinyXML

EDIT (different XML now posted): that's the problem with using regex or line matching to parse XML: you don't know how the XML will break lines. You can keep adding more and more layers of complexity until you have written your own XML parser. Just use one of the parsers listed in "What is the best open XML parser for C++?"

Upvotes: 6

karlphillip

Reputation: 93410

A single line of tags in a file can hardly be described as XML. Anyway, if you really want to parse an XML file, it is much easier with a parser library like RapidXML. This page is an excellent resource.

The code below is my attempt to read the following XML (an XML file should start with an XML declaration, although XML 1.0 does not strictly require one):

File: demo.xml

<?xml version="1.0" encoding="utf-8"?>
<rootnode version="1.0" type="example">
    <Package> xmlMetadata </Package>
</rootnode>

A quick note: RapidXML is a header-only library. On my system I unzipped it to /usr/include/rapidxml-1.13, so the code below can be compiled with:

g++ read_tag.cpp -o read_tag -I/usr/include/rapidxml-1.13/

File: read_tag.cpp

#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include <rapidxml.hpp>

using namespace std;
using namespace rapidxml;


int main()
{
    string input_xml;
    string line;
    ifstream in("demo.xml");

    // read file into input_xml
    while(getline(in,line))
        input_xml += line;

    // make a safe-to-modify copy of input_xml
    // (you should never modify the contents of an std::string directly)
    vector<char> xml_copy(input_xml.begin(), input_xml.end());
    xml_copy.push_back('\0');

    // only use xml_copy from here on!
    xml_document<> doc;
    // we are choosing to parse the XML declaration
    // parse_no_data_nodes prevents RapidXML from using the somewhat surprising
    // behavior of having both values and data nodes, and having data nodes take
    // precedence over values when printing
    // >>> note that this will skip parsing of CDATA nodes <<<
    doc.parse<parse_declaration_node | parse_no_data_nodes>(&xml_copy[0]);

    // alternatively, use one of the two commented lines below to parse CDATA nodes,
    // but please note the above caveat about surprising interactions between
    // values and data nodes (also read http://www.ffuts.org/blog/a-rapidxml-gotcha/)
    // if you use one of these two declarations try to use data nodes exclusively and
    // avoid using value()
    //doc.parse<parse_declaration_node>(&xml_copy[0]); // just get the XML declaration
    //doc.parse<parse_full>(&xml_copy[0]); // parses everything (slowest)

    // since we have parsed the XML declaration, it is the first node
    // (otherwise the first node would be our root node)
    string encoding = doc.first_node()->first_attribute("encoding")->value();
    // encoding == "utf-8"

    // we didn't keep track of our previous traversal, so let's start again
    // we can match nodes by name, skipping the xml declaration entirely
    xml_node<>* cur_node = doc.first_node("rootnode");
    string rootnode_type = cur_node->first_attribute("type")->value();
    // rootnode_type == "example"

    // go straight to the first Package node
    cur_node = cur_node->first_node("Package");
    if (!cur_node)
    {
        // first_node() returns a null pointer when no such node exists
        cerr << "No <Package> node found" << endl;
        return 1;
    }
    string content = cur_node->value();

    cout << content << endl;
}

Outputs:

xmlMetadata

Upvotes: 1
