Reputation: 2444
I've just started switching to QRegularExpression, and I'm using it to tokenize a string with multiple delimiter possibilities. I've encountered a surprising behavior, which seems to me to be a bug. I'm using Qt 5.5.1 on Windows.
Here's sample code:
#include <QRegularExpression>
#include <QString>
#include <QtDebug>
int main(int argc, char *argv[])
{
    Q_UNUSED (argc);
    Q_UNUSED (argv);
    QRegularExpression regex ("^ ");
    qDebug () << "Expected: " << QString ("M 100").indexOf(regex);
    qDebug () << "NOT expected:" << QString ("M 100").indexOf(regex, 1);
    qDebug () << "Expected: " << QString (" 100").indexOf(regex);
    QRegularExpression regex1 (" ");
    qDebug () << "Expected: " << QString ("M 100").indexOf(regex1);
}
And the output:
Expected: -1
NOT expected: -1
Expected: 0
Expected: 1
Using the caret (^) with a starting position other than 0 in the "indexOf" call prevents the expression from matching. Intuitively, I expected the caret to match at the position I specified. Instead, it never matches.
I'm going to switch my tokenizing to use splitRef to avoid this problem. While that's probably slightly cleaner anyway, I need to understand whether this is correct behavior or whether I should report a bug to Qt.
UPDATE: Using splitRef doesn't entirely solve my problem because I need to use a regular expression to detect if some tokens are floating point numbers, and I can't use a QRegularExpression with QStringRef. For that possibility, I have to convert my QStringRef token into an actual QString, which was what I was trying to avoid in the first place.
Upvotes: 0
Views: 447
Reputation: 22816
^ matches at the beginning of the subject string, or after a newline when in multiline mode. The offset does not alter these semantics. Hence, matching /^ / (in regex notation) against M 100 at offset 1 correctly results in no match.
Perhaps you want \G? From pcrepattern(3):
\G matches at the first matching position in the subject
The \G assertion is true only when the current matching position is at the start point of the match, as specified by the startoffset argument of pcre_exec(). It differs from \A when the value of startoffset is non-zero.
With that, this code:
QRegularExpression regex ("\\G ");
qDebug () << "Expected: " << QString ("M 100").indexOf(regex);
qDebug () << "NOT expected:" << QString ("M 100").indexOf(regex, 1);
qDebug () << "Expected: " << QString (" 100").indexOf(regex);
prints
Expected: -1
NOT expected: 1
Expected: 0
Upvotes: 1