damianb

Reputation: 1224

Get QString length (in characters, not bytes)

I need the actual character count of a QString (not its byte count), matching what V8 reports for a string's length.

I need this for Twitter, which goes by character count regardless of the language used; it does NOT go by UTF-8 byte length.

Example, in the Chrome/Chromium JS console or in Node.js:

> "Schöne Grüße".length
< 12

In Qt 4.8.2, the following prints 15, which is not what I'm aiming for:

QString someStr = "Schöne Grüße";
cout << someStr.length();

Upvotes: 4

Views: 9891

Answers (2)

Ruslan

Reputation: 19140

If you really want to count grapheme clusters (i.e. the user-perceived characters) instead of code units, you need QTextBoundaryFinder. Here's an example of use:

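#include <iostream>
#include <QString>
#include <QTextBoundaryFinder>

int main()
{
    // Two of the characters lie outside the BMP, so each takes two UTF-16 code units.
    const QString s = QString::fromUtf8(u8"abc\U00010139def\U00010102g");
    std::cout << "String: \"" << s.toStdString() << "\"\n";
    // QString::length() counts UTF-16 code units, not characters.
    std::cout << "Code unit count       : " << s.length() << "\n";

    // Each successful toNextBoundary() call moves past one grapheme cluster.
    QTextBoundaryFinder tbf(QTextBoundaryFinder::Grapheme, s);
    int count = 0;
    while (tbf.toNextBoundary() != -1)
        ++count;
    std::cout << "Grapheme cluster count: " << count << "\n";
}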

Output:

String: "abc𐄹def𐄂g"
Code unit count       : 11
Grapheme cluster count: 9
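
For comparison, here is a Qt-free sketch of the middle ground between the two numbers above: counting Unicode code points in a UTF-16 string by folding each surrogate pair into one. The helper name countCodePoints is hypothetical, and this is only an illustration, not a replacement for QTextBoundaryFinder. Code points are still not grapheme clusters: a base letter followed by a combining mark counts as two here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-16 string.
// A high surrogate immediately followed by a low surrogate counts once;
// unpaired surrogates are counted as one unit each.
std::size_t countCodePoints(const std::u16string& s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t c = s[i];
        const bool isHigh = c >= 0xD800 && c <= 0xDBFF;
        if (isHigh && i + 1 < s.size()) {
            const char16_t next = s[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF)
                ++i; // skip the low surrogate of this pair
        }
        ++count;
    }
    return count;
}
```

For the string used above, std::u16string(u"abc\U00010139def\U00010102g").size() is 11, while countCodePoints gives 9. It matches the grapheme count only because this particular string contains no combining sequences.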

Upvotes: 5

jdi

Reputation: 92627

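I believe you need to construct the string with the static QString::fromUtf8 method, so that the bytes of the literal are decoded as UTF-8 (this assumes the source file itself is saved as UTF-8):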

QString s = QString::fromUtf8("Schöne Grüße");
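
To illustrate where the 15 versus 12 presumably comes from, here is a small Qt-free sketch. It assumes valid UTF-8 input, and the helper name utf16LengthOfUtf8 is made up for this example. It computes how many UTF-16 code units a UTF-8 string decodes to, which is what JavaScript's .length reports and what QString::length() reports once the bytes have been decoded correctly.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of UTF-16 code units that a (valid) UTF-8
// string decodes to. Assumes the input is well-formed UTF-8.
std::size_t utf16LengthOfUtf8(const std::string& utf8)
{
    std::size_t units = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) == 0x80)
            continue; // continuation byte, belongs to the previous lead byte
        // A 4-byte sequence (lead byte >= 0xF0) encodes a code point
        // outside the BMP, which needs a surrogate pair in UTF-16.
        units += (byte >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

With "Schöne Grüße" spelled out as UTF-8 bytes ("Sch\xC3\xB6ne Gr\xC3\xBC\xC3\x9F" "e"), std::string::size() is 15 because ö, ü and ß take two bytes each, while utf16LengthOfUtf8 gives 12. Reading each byte as a separate character gives the 15 from the question; decoding the bytes as UTF-8 first gives the 12 that V8 shows.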

Upvotes: 0
