Reputation: 11
I am reading an XML file into a stringstream buffer in order to parse it using RapidXML. RapidXML is only parsing the names of the XML nodes, but none of their attribute names or values. After some experimentation, I discovered that the problem is not likely to be with RapidXML, but with conversion of the stringstream buffer to a string using std::string content(buffer.str());. The '=' characters that are so important to XML parsing are converted to ' ' (space characters), prior to any RapidXML processing.
The character replacement is evident in the console window when the cout << calls are made in the code below, which is before RapidXML gets its hands on the string.
My code is as follows:
#include <iostream>
#include <fstream>
#include <stdio.h>
#include <conio.h>
#include <string>
#include <stdlib.h>
#include <rapidxml.hpp>
#include <vector>
#include <sstream>
using namespace std;
using namespace rapidxml;
//... main() and so forth, all works fine...
ifstream file(names.at(i)); // names.at(i) works fine...
//...
file.read(fileData, fileSize); // works fine...
//...
// Create XML document object using RapidXML:
xml_document<> doc;
//...
std::stringstream buffer;
buffer << file.rdbuf();
// This is where everything looks okay (i.e., '=' shows up properly):
cout << "\n" << buffer.str() << "\n\nPress a key to continue...";
getchar();
file.close();
std::string content(buffer.str());
// This is where the '=' are replaced by ' ' (space characters):
cout << "\n" << content << "\n\nPress a key to continue...";
getchar();
// Parse XML:
doc.parse<0>(&content[0]);
// Presumably the lack of '=' is preventing RapidXML from parsing attribute
// names and values, which always follow '='...
Thanks in advance for your help.
p.s. I followed advice on using this technique for reading an entire XML file into a stringstream, converting it to a string, and then feeding the string to RapidXML from the following links (thanks to contributors of these pieces of advice, sorry I can't make them work yet...):
Automation Software's RapidXML mini-tutorial
...this method was seen many other places, I won't list them here. Seems sensible enough. My errors seem to be unique. Could this be an ASCII vs. UNICODE issue?
I also tried code from here:
Thomas Whitton's example converting a string buffer to a dynamic cstring
code snippet from the above:
// string to dynamic cstring
std::vector<char> stringCopy(xml.length() + 1, '\0'); // +1 keeps a terminating zero, which RapidXML requires
std::copy(xml.begin(), xml.end(), stringCopy.begin());
char *cstr = &stringCopy[0];
rapidxml::xml_document<> parsedFromFile;
parsedFromFile.parse<0>(cstr);
...with similar RapidXML failure to parse node attribute names and values. Note that I didn't dump the character vector stringCopy to the console to inspect it, but I am getting the same problem, which for review is:
Upvotes: 0
Views: 446
Reputation: 65126
If you look closely, the = characters probably aren't being replaced by spaces but by zero bytes. If you look at the RapidXML documentation here:
http://rapidxml.sourceforge.net/manual.html#namespacerapidxml_1differences
It specifically states that the parser modifies the source text. This lets it avoid allocating any new strings; the names and values it returns are pointers into the original source.
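To make the idea concrete, here is a minimal sketch of in-place splitting. It is not RapidXML's actual code; `NameValue` and `split_in_place` are hypothetical names invented for this illustration. It only shows why a parser that returns pointers into the source has to overwrite the character after a name (here the '=') with a zero byte.

```cpp
#include <cstring>
#include <string>

// Hypothetical illustration, NOT RapidXML's implementation: split
// "name=value" text in place, without allocating any new strings.
struct NameValue {
    const char* name;   // points into the caller's buffer
    const char* value;  // points into the caller's buffer, or nullptr
};

NameValue split_in_place(char* text) {
    char* eq = std::strchr(text, '=');
    if (eq == nullptr) {
        return {text, nullptr};  // no '=' found, nothing is overwritten
    }
    *eq = '\0';                  // the '=' becomes a zero byte ending the name
    return {text, eq + 1};       // the value starts right after it
}
```

Called on a `std::string` holding `id=42` (via `&content[0]`), this yields the name `id` and the value `42`, and the string keeps its length of 5 but now has a zero byte where the '=' was. According to the RapidXML manual, the `parse_non_destructive` flag turns this behaviour off, at the cost of names and values no longer being zero-terminated.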
This part seems to work correctly; maybe the problem is with the rest of your code, the part that's trying to read the attributes?
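One way to check whether the '=' characters turned into spaces or into zero bytes is to dump the buffer with zero bytes made visible. A console usually renders a zero byte as a blank, so the two cases look the same in plain `cout` output. The helper below is a sketch; `show_zero_bytes` and the `<NUL>` marker are made up for this example.

```cpp
#include <string>

// Hypothetical debugging helper: returns a copy of buf in which every
// zero byte is replaced by the visible marker "<NUL>". Real spaces are
// left untouched, so the two cases can be told apart on the console.
std::string show_zero_bytes(const std::string& buf) {
    std::string out;
    for (char c : buf) {
        if (c == '\0') {
            out += "<NUL>";
        } else {
            out += c;
        }
    }
    return out;
}
```

Printing `show_zero_bytes(content)` after the call to `doc.parse<0>` should show `<NUL>` wherever the parser terminated a name or value. The attributes themselves are normally read through the parsed tree, using `first_attribute()` on a node and `name()` and `value()` on the attribute, rather than from the raw buffer.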
Upvotes: 1