Thalia

Reputation: 14615

Regular expressions returning empty strings

I need help making some string replacements with regular expressions.

The task: scale the fonts in a generated HTML string. I am using Qt, and the solution must work in Qt 4.8.

I came up with a regular expression to isolate the section containing the font sizes and tested it (https://regex101.com/r/Y0W13N/1). I don't know whether it is correct or optimal, but the test site gives me the right output. In my code, however, I get no matches:

// get string between "<span style=\"" and "\">" (escaped quotes and backslashes)
QRegExp rx1("<span style=\"(?:=([^\\]]+))?(.*?);\">");
int pos = rx1.indexIn(text);
QStringList listSpans1 = rx1.capturedTexts();
qDebug() << listSpans1;                               // outputs ("", "", "") 

// get string between "<p style=\"" and "\">"
QRegExp rx2("<p style=\"(?:=([^\\]]+))?(.*?);\">");
pos = rx2.indexIn(text);
QStringList listSpans2 = rx2.capturedTexts();
qDebug() << listSpans2;                               // outputs ("", "", "") 

The text I am testing with is

"<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
<html><head><meta name="qrichtext" content="1" /><style type="text/css">
p, li { white-space: pre-wrap; }
</style></head><body style=" font-family:'MS Shell Dlg 2'; font-weight:400; font-style:normal;">
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><span style=" font-family:'Some Font'; font-size:15pt; color:#000000;">Te</span><span style=" font-family:'Some Font'; font-size:9pt; color:#000000;">xt</span></p>
<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; font-family:'Some Font'; font-size:9pt; color:#000000;"></p>
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><span style=" font-family:'Some Font'; font-size:9pt; color:#000000;"> B</span><span style=" font-family:'Some Font'; font-size:15pt; color:#000000;">ox</span></p>
<p style="-qt-paragraph-type:empty; margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px; font-family:'Some Font'; font-size:18pt; color:#000000;"></p></body></html>" 

qDebug prints only empty strings. I don't understand why, given that the test site shows the correct strings and reports matches for the same pattern.

(The next step is to isolate the font portion, determine the font size, scale it, and write it back. That seems very complicated for such a simple operation, but I could find no easier way.)
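For illustration, the steps listed above (find the font size, scale it, write it back) can also be sketched without any regular expression. This is only a minimal sketch in plain standard C++, not Qt code: `scaleFontSizes` is a hypothetical helper name, and it assumes every size is written as `font-size:<number>pt` with no space after the colon, as in the sample text. Values in other units are left untouched.

```cpp
#include <cctype>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Qt or of the original post):
// multiplies every "font-size:<number>pt" value in html by factor.
std::string scaleFontSizes(const std::string& html, double factor)
{
    const std::string key = "font-size:";
    std::string out;
    std::string::size_type pos = 0;

    while (true) {
        const std::string::size_type hit = html.find(key, pos);
        if (hit == std::string::npos) {
            out.append(html, pos, std::string::npos);  // copy the tail unchanged
            break;
        }

        // Find the run of digits (and '.') that follows "font-size:".
        const std::string::size_type numStart = hit + key.size();
        std::string::size_type numEnd = numStart;
        while (numEnd < html.size() &&
               (std::isdigit(static_cast<unsigned char>(html[numEnd])) ||
                html[numEnd] == '.'))
            ++numEnd;

        // Copy everything up to and including "font-size:".
        out.append(html, pos, numStart - pos);

        if (numEnd > numStart && html.compare(numEnd, 2, "pt") == 0) {
            // A point size: parse it, scale it, and write the new number.
            const double size =
                std::atof(html.substr(numStart, numEnd - numStart).c_str());
            std::ostringstream scaled;
            scaled << size * factor;
            out += scaled.str();
        } else {
            // Not a point size (e.g. "12px"): keep the original characters.
            out.append(html, numStart, numEnd - numStart);
        }
        pos = numEnd;  // the "pt" suffix is copied on the next pass
    }
    return out;
}
```

With this sketch, `scaleFontSizes("font-size:15pt; color:#000000;", 2.0)` returns `"font-size:30pt; color:#000000;"`. Scanning for the literal `font-size:` key avoids having to match the surrounding `<span>` or `<p>` tag at all.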

The regular expressions I wrote work on the test site but not in my code, and I don't know why. I have no experience with regex.

Please help me get my regular expressions working. Thank you.

Upvotes: 1

Views: 253

Answers (1)

Wiktor Stribiżew

Reputation: 627327

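The point here is that QRegExp does not support the lazy quantifiers *? and +?.

You can solve the problem by calling rx1.setMinimal(true), which makes the quantifiers non-greedy, and using .* as the Group 1 pattern. After a successful indexIn(), the style text is then available as rx1.cap(1):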

QRegExp rx1("<span style=\"(.*);\">");
rx1.setMinimal(true);

Same with the second regex:

QRegExp rx2("<p style=\"(.*);\">");
rx2.setMinimal(true);
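To see why minimal matching matters here, the following sketch compares a greedy and a non-greedy capture on a line that contains two spans. It uses `std::regex` from C++11 purely as a stand-in, because that engine accepts the `.*?` syntax directly; it is not QRegExp, and `firstStyle` is a hypothetical helper name. In QRegExp the non-greedy behaviour comes from `setMinimal(true)`, as shown above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns capture group 1 of the first match of
// pattern in text, or an empty string if there is no match.
std::string firstStyle(const std::string& text, const std::string& pattern)
{
    const std::regex rx(pattern);
    std::smatch m;
    return std::regex_search(text, m, rx) ? m[1].str() : std::string();
}
```

For the input `<span style=" font-size:15pt;">Te</span><span style=" font-size:9pt;">xt</span>`, the greedy pattern `<span style="(.*);">` captures everything up to the last `;">` on the line (` font-size:15pt;">Te</span><span style=" font-size:9pt`), while the non-greedy pattern `<span style="(.*?);">` stops at the first one and captures only ` font-size:15pt`.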

Upvotes: 1
