Vladimir Bershov

Reputation: 2832

How can UTF-8 fail to work in Qt 5?

I have a minimal test project built with MSVC 2015 and Qt 5.9.3. The file main.cpp is saved as Unicode (UTF-8 with signature):

[screenshot]

I am trying to show a message box that should display some text in Russian. The whole code:

#include <QtWidgets/QApplication>
#include <QMessageBox>

int main(int argc, char *argv[])
{
    QApplication a(argc, argv);

    QString ttl = QString::fromUtf8("russian_word_1");
    QString txt = QString::fromUtf8("russian_word_2");

    QMessageBox::information(nullptr, ttl, txt);

    return a.exec();
}

And this is what I get:

[screenshot]

How is this possible?


Update 1: I specifically want to use UTF-8 with a BOM, in line with this statement by a Stack Overflow author:

...It does not make sense to have a string without knowing what encoding it uses...


Update 2: In this particular case, most likely it is a bug in the compiler.

Upvotes: 3

Views: 4033

Answers (4)

cangyin

Reputation: 63

If you're using Qt Creator with the MSVC compiler, this may help you.

TLDR:

  1. save all of your source files as UTF-8 without BOM
  2. add this line in your .pro file: QMAKE_CXXFLAGS += /utf-8

Done!

refs:

  1. MSVC compiler flag to Set Source and Executable character sets to UTF-8
  2. Add compiler flag in Qt Creator
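
One way to confirm that the flag took effect is to inspect the bytes the compiler actually stored for a string literal. The following is a minimal sketch in standard C++ (no Qt needed). The helper name executionCharsetLooksLikeUtf8 is hypothetical, and the check relies on the fact that UTF-8 encodes U+043F (Cyrillic small letter pe) as the two bytes D0 BF:

```cpp
#include <cstddef>

// Hypothetical helper (not part of Qt or MSVC): returns true when the
// ordinary string literal "\u043F" was stored as the UTF-8 bytes D0 BF.
bool executionCharsetLooksLikeUtf8()
{
    // The universal character name keeps this source file pure ASCII,
    // so only the execution character set influences the result.
    static const char pe[] = "\u043F";

    // Two payload bytes plus the terminating '\0' give a size of 3.
    return sizeof(pe) == 3
        && static_cast<unsigned char>(pe[0]) == 0xD0
        && static_cast<unsigned char>(pe[1]) == 0xBF;
}
```

With MSVC and no /utf-8 flag, the literal is stored in the system code page instead, so the function would be expected to return false there.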

Upvotes: 2

Mohammad Kanan

Reputation: 4582

Use a QByteArray for your message text, then convert it to a Unicode QString for display:

#include <QApplication>
#include <QByteArray>
#include <QMessageBox>
#include <QString>
#include <QTextCodec>

int main(int argc, char *argv[])
{
    QApplication a(argc, argv);
    QTextCodec *codec1 = QTextCodec::codecForName("CP1256");
    // Converted text: the literal's raw bytes are decoded as CP1256
    QByteArray myLanguage = "لا لا لا لا لا لا لا ";
    QString myLanguage2unicode = codec1->toUnicode(myLanguage);
    // Non-converted text: the literal's raw bytes are decoded as UTF-8
    QString txt = QString::fromUtf8("لا لا لا لا لا لا لا  ");

    QMessageBox::information(nullptr, myLanguage2unicode, txt);

    return a.exec();
}

Result of the above code:

[screenshot]

Upvotes: 0

Nikos C.

Reputation: 51840

If the compiler produces garbage strings for UTF-8 source files that have a BOM, then it's a bug in the compiler. However, the use of a BOM with UTF-8 is not recommended in the first place. You shouldn't use it unless you actually have a reason to.
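
For reference, the UTF-8 BOM is the three-byte sequence EF BB BF at the very start of the file. A minimal sketch of detecting and dropping it from a buffer of raw file bytes, in standard C++; the helper names hasUtf8Bom and stripUtf8Bom are hypothetical and not part of Qt or the standard library:

```cpp
#include <string>

// Hypothetical helper: true when the buffer starts with the UTF-8 BOM
// (bytes EF BB BF).
bool hasUtf8Bom(const std::string& bytes)
{
    return bytes.size() >= 3
        && static_cast<unsigned char>(bytes[0]) == 0xEF
        && static_cast<unsigned char>(bytes[1]) == 0xBB
        && static_cast<unsigned char>(bytes[2]) == 0xBF;
}

// Hypothetical helper: returns a copy of the buffer without a leading BOM.
// Buffers that do not start with a BOM are returned unchanged.
std::string stripUtf8Bom(const std::string& bytes)
{
    return hasUtf8Bom(bytes) ? bytes.substr(3) : bytes;
}
```

Reading main.cpp in binary mode and passing its contents to hasUtf8Bom would show whether the editor really saved the file "with signature".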

Furthermore, you don't need to do explicit fromUtf8() conversions. You can just do:

QString ttl = "russian_word_1";
QString txt = "russian_word_2";

QString assumes string literals are UTF-8. From the documentation:

In all of the QString functions that take const char * parameters, the const char * is interpreted as a classic C-style '\0'-terminated string encoded in UTF-8.

You may use QStringLiteral to wrap string literals as an optimization, but this is not required.

Lastly, you can use tr() to wrap the string literals if you at some point want to translate the application from Russian to other languages. It is generally a good idea to use tr() in case you later decide to do a translation.

Note that having non-English strings in source code is generally fine. It's what UTF-8 (and Unicode in general) is there for. All modern compilers support it. What most people frown upon, however, is non-English code:

auto индекс = 0; // Please don't.

But non-English strings are fine.

Upvotes: 1

Dmitry Sazonov

Reputation: 8994

Don't use non-ASCII text in your code, because the result depends on the compiler, the source file encoding, and so on. Write only English text, wrapped in tr(""). Create translation files and load them. Read about internationalization in Qt.
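
Whether a piece of text follows this rule can be checked mechanically. A minimal sketch in standard C++, assuming the text to check is available as a std::string of raw bytes; the helper name isPureAscii is hypothetical:

```cpp
#include <string>

// Hypothetical helper: true when every byte is in the 7-bit ASCII range
// (0x00 to 0x7F), which is encoded identically in UTF-8 and in the
// Windows code pages, so it does not depend on the source file encoding.
bool isPureAscii(const std::string& text)
{
    for (unsigned char byte : text) {
        if (byte > 0x7F) {
            return false;
        }
    }
    return true;
}
```

Any byte above 0x7F means the text contains a character whose representation depends on the encoding in use.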

Another useful link.

Upvotes: 2
