Qt : QXmlQuery and XPaths

Question

I need some help with QXmlQuery and XPath. I'm trying to use them together to extract data from several HTML documents. The documents are downloaded and then cleaned with the HTML Tidy library.

The problem appears when I run my XPath expressions. Here is some example code:

[...]
    
        Hauteur : 1127 mm
        Largeur : 640 mm
        Profondeur : 685 mm
        Poids : 159.6 kg
[...]

The cleaned code is stored in a QString named code:

QStringList fields, values;
QXmlQuery query;

query.setFocus(code);
query.setQuery("//*[@id=\"idTab2\"]/*/*/string()");
query.evaluateTo(&fields);

My goal is to get all the fields (Hauteur, Largeur, Profondeur, Poids, etc.) and their values (1127 mm, 640 mm, 685 mm, 159.6 kg, etc.).

Question 1

As you can see, I use the XPath //*[@id="idTab2"]/*/*/string() to retrieve the fields, because //ul[@id="idTab2"]/li/span/string() doesn't work. As soon as I specify a tag name, the query returns nothing; it only works with *. Why? I've checked the code returned by the Tidy function, and the structure the XPath relies on is not altered, so I don't see any problem. Is this normal, or is there something I'm missing?
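One possible explanation, stated as an assumption rather than a confirmed diagnosis: when HTML Tidy emits XHTML, the elements usually sit in the http://www.w3.org/1999/xhtml namespace, so an unprefixed name such as ul would not match them, while * would. QXmlQuery evaluates XQuery by default, and XQuery allows a default element namespace to be declared in the query prolog. The sketch below uses a hypothetical helper, withXhtmlNamespace (not part of Qt), that prepends such a declaration to a query string.

```cpp
#include <string>

// Hypothetical helper (not part of Qt): prepends an XQuery prolog so that
// unprefixed element names (ul, li, span) resolve to the XHTML namespace.
// Assumption: the Tidy output declares xmlns="http://www.w3.org/1999/xhtml".
std::string withXhtmlNamespace(const std::string &xpath)
{
    static const std::string prolog =
        "declare default element namespace \"http://www.w3.org/1999/xhtml\"; ";
    return prolog + xpath;
}

// Possible use with the code from the question:
//   query.setFocus(code);
//   query.setQuery(QString::fromStdString(
//       withXhtmlNamespace("//ul[@id=\"idTab2\"]/li/span/string()")));
//   query.evaluateTo(&fields);
```

If the cleaned document carries no namespace declaration, this prolog would have the opposite effect and the named steps would stop matching, so it is worth checking the root element first.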

Question 2

In the XHTML code above, each li tag wraps a span tag and some text. I don't know how to get only the text and not the content of the span tag. I tried:

//*[@id="idTab2"]/*/string() gives: Hauteur : 1127 mm Largeur : 640 mm Profondeur : 685 mm

//*[@id="idTab2"]/*[2]/string() gives: nothing

So, unless I'm mistaken, the text in the li tag is not being treated as a child node, although it should be. See the accepted answer to: Select just text directly in node, not in child nodes.
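For context, * only matches element nodes, whereas text nodes are selected with text(), so a query along the lines of //*[@id="idTab2"]/*/text()/string() may be closer to what is wanted (untested here). As a fallback that stays on the C++ side, each string returned by //*[@id="idTab2"]/*/string() could be split into a field and a value. The sketch below assumes every item has the form "field : value"; trim and splitField are hypothetical helper names.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: strips leading and trailing spaces, tabs and newlines.
static std::string trim(const std::string &s)
{
    const char *ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Splits an item such as "Hauteur : 1127 mm" at the first ':' into a
// (field, value) pair. An item without ':' becomes a field with an empty value.
std::pair<std::string, std::string> splitField(const std::string &item)
{
    const auto colon = item.find(':');
    if (colon == std::string::npos)
        return {trim(item), ""};
    return {trim(item.substr(0, colon)), trim(item.substr(colon + 1))};
}
```

With the QStringList from the question, each entry could be converted with QString::toStdString() before being passed to splitField.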

Thanks for reading, I hope someone can help me.
