Avinash

Reputation: 13267

XML search algorithm C++

I am trying to find out how to write an XML search algorithm.

The following is my file:

    <DUMMYROOT>
        <root>Job Started</root>
        <root>Job Running</root>
    </DUMMYROOT>

I want to search for a string such as <root>Job Started</root>. I should also be able to supply nested levels of nodes as the search string, like:

<DUMMYROOT><root1><root2><root3>STRINGTOSEARCH</root3></root2></root1></DUMMYROOT>

Also, my file may not be complete XML at the time I apply my search algorithm.

Upvotes: 0

Views: 1702

Answers (2)

Jay

Reputation: 14481

If your file is incomplete, most XML parsers will choke when trying to read it. You might be better off just doing a string search against the file content.
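As a rough sketch of that string-search idea: the helper below builds the literal text `<tag>text</tag>` and looks for it with `std::string::find`. The name `contains_element` is made up for illustration, and it assumes the element appears exactly as written, with no attributes and no extra whitespace inside the tags.

```cpp
#include <string>

// Hypothetical helper: true if content contains <tag>text</tag> verbatim.
// This is a plain substring match, so it works on incomplete XML, but it
// does not tolerate attributes or whitespace differences.
bool contains_element(const std::string &content,
                      const std::string &tag,
                      const std::string &text) {
    const std::string needle = "<" + tag + ">" + text + "</" + tag + ">";
    return content.find(needle) != std::string::npos;
}
```

With the file from the question read into a string, `contains_element(content, "root", "Job Started")` would report a match even though the closing `</DUMMYROOT>` has not been written yet.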

Upvotes: 0

Jerry Coffin

Reputation: 490623

Here's something I wrote a few years ago that seems to fit reasonably well with what you're looking for (though make no mistake, it is kind of ugly, and if the XML is really badly formed, it might run into a problem).

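#include <iterator>
#include <string>
#include <vector>

using std::string;

// Split input on each occurrence of sep (which must be non-empty),
// writing the pieces to output.
template <class OutIt>
void split(string const &input, string const &sep, OutIt output) {
    size_t start = 0;
    size_t pos;
    do {
        pos = input.find(sep, start);
        // When pos is npos, the count is clamped to the end of the string.
        *output++ = string(input, start, pos-start);
        start = pos + sep.size();  // skip the whole separator, not just one character
    } while (pos != string::npos);
}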

// Return the text inside the element named by a backslash-separated path
// such as "a\\b\\c", or "" if any element on the path cannot be found.
string extract(string const &input, string const &field, bool whole=false) {
    std::vector<string> names;
    split(field, "\\", std::back_inserter(names));

    size_t b = 0, e = string::npos;
    string ret(input);

    for (size_t i=0; i<names.size(); i++) {
        // Narrow the search to the contents of the previously matched element.
        ret = string(ret, b, e-b);
        string sname = "<" + names[i];
        string ename = "</" + names[i];
        if (whole) {
            sname += ">";
            ename += ">";
        }
        b = ret.find(sname);
        if (b == string::npos)
            return "";
        // Move past the end of the opening tag, including any attributes.
        b = ret.find(">", b);
        if (b == string::npos)
            return "";
        ++b;
        e = ret.find(ename, b);
        if (e == string::npos)
            return "";
    }
    ret = string(ret, b, e-b);

    // minor cleanup: replace tabs with spaces before returning.
    size_t pos;
    while ((pos = ret.find('\t')) != string::npos)
        ret[pos] = ' ';

    return ret;
}

Normal use would be something like:

result = extract(input, "a\\b\\c\\d");

The "whole" parameter governs whether you've specified the "whole" tag, or whether it's allowed to have attributes in addition to what you've specified (e.g., <tag> vs. <tag attribute = "value">).
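To make the effect of "whole" concrete, here is a small sketch that isolates just the opening-tag lookup. `after_open_tag` is a hypothetical helper written for this illustration, not part of the code above; it mirrors the same `find` calls under the assumption that the tag name is matched as a plain prefix.

```cpp
#include <string>

// Hypothetical helper isolating the opening-tag lookup.
// Returns the index just past the '>' that ends the opening tag,
// or std::string::npos if no matching opening tag is found.
std::size_t after_open_tag(const std::string &xml,
                           const std::string &name,
                           bool whole) {
    std::string sname = "<" + name;
    if (whole)
        sname += ">";               // require exactly <name>, no attributes
    std::size_t b = xml.find(sname);
    if (b == std::string::npos)
        return std::string::npos;
    b = xml.find('>', b);           // skip any attributes
    return b == std::string::npos ? b : b + 1;
}
```

With whole set to false, `<tag attribute = "value">` is matched for the name "tag", but so is `<tagline>`, because only the prefix `<tag` is compared. With whole set to true, only a bare `<tag>` matches.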

Upvotes: 1
