Reputation: 13
I'm trying to extract specific words from an HTML document and display them in a plain text edit for the moment (I will later add them to a table). I managed to find the beginning of the word, but I'm unable to get the end part: it shows all the content from the starting position onward. The HTML is something like this:
<span class="title">Some name here</span>
This is the code I wrote:
int sTitle = html_code.indexOf("title\">") + 7;
int eTitle = html_code.indexOf("</span>");
int titLength = eTitle - sTitle;
QString title = html_code.mid(sTitle, titLength);
ui->searchBox->setPlainText(title);
Also, there are a lot of </span> and title tags in the HTML. Thank you!
Upvotes: 1
Views: 265
Reputation: 6776
Your code works perfectly if the following string is assigned to html_code:
QString html_code = "<span class=\"title\">Some name here</span>";
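With several spans in the document, a likely cause of the problem is that indexOf("</span>") returns the first closing tag in the whole document, which may come before the title, so the computed length goes negative. The following is a minimal sketch of searching for the closing tag from the start position instead. It uses std::string so that it is self-contained; with QString, the indexOf overload that takes a "from" argument, together with mid, plays the same role. The function name extract_titles is hypothetical, and the sketch assumes the opening tag is written exactly as shown and that title spans contain no nested spans.

```cpp
#include <string>
#include <vector>

// Collects the text of every <span class="title">...</span> in html.
// Plain substring search, not a real HTML parser: it assumes the opening
// tag is spelled exactly like open_tag and that title spans are not nested.
std::vector<std::string> extract_titles(const std::string &html)
{
    const std::string open_tag = "<span class=\"title\">";
    const std::string close_tag = "</span>";

    std::vector<std::string> titles;
    std::string::size_type pos = 0;
    while ((pos = html.find(open_tag, pos)) != std::string::npos) {
        const std::string::size_type start = pos + open_tag.size();
        // Search for the closing tag from the start of the title text,
        // not from the beginning of the document.
        const std::string::size_type end = html.find(close_tag, start);
        if (end == std::string::npos)
            break; // unterminated span: stop instead of guessing
        titles.push_back(html.substr(start, end - start));
        pos = end + close_tag.size();
    }
    return titles;
}
```

Because each search resumes after the previous closing tag, unrelated spans earlier in the document do not affect the result.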
However, for more complex documents you may consider using the heavy but powerful QtWebKit module and its QWebElement class, which provides access to the DOM tree of an (X)HTML document. It lets you search for only the first matching tag (or a more complex structure) or for the collection of all matching entries, for example:
#include <QWebPage>
#include <QWebFrame>
#include <QWebElement>
void MainWindow::some_handler()
{
    QString html_code = "<span class=\"title\">Some name here</span>"
                        "<span class=\"title\">Some other name here</span>";
    QWebPage page;
    QWebFrame *frame = page.mainFrame();
    frame->setHtml(html_code);
    QWebElement document = frame->documentElement();
    // one item
    QWebElement title = document.findFirst("span.title");
    QString text;
    text += "First title span:\n\t" + title.toPlainText() + '\n';
    // all items
    QWebElementCollection title_collection = document.findAll("span.title");
    text += "\nAll title spans:\n";
    foreach (QWebElement elem, title_collection) {
        text += '\t' + elem.toPlainText() + '\n';
    }
    ui->searchBox->setPlainText(text);
}
To build the above code, add the following module to the project file: QT += webkitwidgets
Note that the QWebPage object works like a browser: it loads linked content and runs JavaScript. If that is not desired, other XML parsers may be considered, for example the Qt XML module. This module is not actively supported and only parses well-formed XML, but it also provides an API for the tree structure of document elements via the QDomDocument, QDomElement and QDomNodeList classes. The code is not as nice as with QWebElement, since you need to loop over the node list and manually check each node's type and its "class" attribute, for example:
QDomDocument document;
document.setContent(html_code);
QDomElement elem = document.documentElement();
QDomNodeList node_list = elem.elementsByTagName("span");
QString text;
for (int i = 0; i < node_list.length(); ++i) {
    if (node_list.at(i).isElement() &&
        node_list.at(i).toElement().attribute("class") == "title")
    {
        text += node_list.at(i).toElement().text() + '\n';
    }
}
Upvotes: 1
Reputation: 314
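Try this:
int sTitle = html_code.indexOf("title\">") + 7;
// Look for the closing tag after the title starts, not from the beginning of the document.
int eTitle = html_code.indexOf("</span>", sTitle);
// QStringRef takes a pointer to the string, a start position and a length.
QStringRef title(&html_code, sTitle, eTitle - sTitle);
ui->searchBox->setPlainText(title.toString());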
Upvotes: 0