ustulation

Reputation: 3760

Write Unicode UTF-8 and UTF-16 data into a QByteArray

I'm developing a protocol for my company using Qt. It has to query the Windows Registry and write the value obtained to a prenegotiated socket. I have read the registry data into a QString. The specification requires 8-bit Unicode characters for some fields and 16-bit Unicode characters for others. I collect all the data in a QByteArray before writing it to the socket with QTcpSocket::write(). Little-endian byte order must be used throughout.

  1. How do I get the data from the QString into the QByteArray in 8-bit Unicode format (the specification says the character type corresponds to quint8)?

  2. How do I get the data from the QString into the QByteArray in 16-bit Unicode format (the specification says the character type corresponds to quint16)?

  3. How can I maintain little-endian byte order in all cases?

(I have no experience of dealing with Unicode/variable-byte-encoded data)

Upvotes: 2

Views: 3591

Answers (2)

jonathanzh

Reputation: 1454

For your question 2 (how to convert a QString to a UTF-16 QByteArray), one solution is QTextCodec::fromUnicode(), as shown in the following code example:

#include <QCoreApplication>
#include <QTextCodec>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    // QString to QByteArray
    // =====================

    QString qstr_test = "test"; // from QString
    qDebug().noquote().nospace() << "qstr_test[" << qstr_test << "]";     // Should see: qstr_test[test]

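    // Note: the "UTF-16" codec prepends a BOM and uses the host byte order;
    // the codec name "UTF-16LE" requests little-endian output explicitly.
    QTextCodec * pTextCodec = QTextCodec::codecForName("UTF-16");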

    QByteArray qba_test = pTextCodec->fromUnicode(qstr_test); // to UTF-16 QByteArray
    qDebug() << "qba_test[";
    int test_size = qba_test.size();
    for(int i = 0; i < test_size; ++i) { // Should see each UTF-16 encoded character per line like this: ÿþt e s t
        qDebug() << qba_test.at(i);
    }
    qDebug() << "]";

    return a.exec();
}

The above code has been tested using Qt 5.4.
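
The leading ÿþ in the expected output is the pair of bytes 0xFF 0xFE rendered as Latin-1 characters, which is the UTF-16 byte order mark (BOM) written in little-endian order. As a minimal sketch in plain C++ (no Qt needed), the check below inspects the first two bytes of a buffer to see which byte order the BOM announces. The names Utf16Order and detectUtf16Bom are hypothetical helpers made up for illustration; they are not part of Qt.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch (not a Qt type).
enum class Utf16Order { Unknown, LittleEndian, BigEndian };

// Hypothetical helper: look at the first two bytes of a buffer and report
// which byte order a UTF-16 BOM (U+FEFF) indicates, if one is present.
Utf16Order detectUtf16Bom(const std::uint8_t *data, std::size_t size)
{
    if (size >= 2) {
        if (data[0] == 0xFF && data[1] == 0xFE)
            return Utf16Order::LittleEndian; // FF FE: low byte came first
        if (data[0] == 0xFE && data[1] == 0xFF)
            return Utf16Order::BigEndian;    // FE FF: high byte came first
    }
    return Utf16Order::Unknown;              // no BOM, or buffer too short
}
```

With a QByteArray, the same check could be fed from constData() and size() (with a cast to an unsigned byte pointer), assuming the array holds the raw encoded bytes.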

Upvotes: 2

Samuel Harmer

Reputation: 4412

Looking in the documentation is a good start.

  1. QString::toUtf8()
  2. Create a new QByteArray starting with a BOM. Use QString::utf16() to get the ushort values, then split each value into its low and high bytes and append them to the QByteArray in the byte order you want.
  3. UTF-8 has no endianness, so nothing needs to be done for it. To put 16-bit values into an 8-bit array in little-endian order, append the low byte first (value & 0x00FF), then the high byte ((value & 0xFF00) >> 8).
  4. Read "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" at home tonight.
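
Points 2 and 3 of the list above can be sketched in plain C++ as follows. This is only an illustration of the byte splitting, not Qt code: toUtf16LeBytes is a hypothetical helper, and std::u16string and std::vector stand in for the QString and QByteArray of the question.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of Qt): serialise UTF-16 code units as
// little-endian bytes, optionally preceded by the BOM U+FEFF.
std::vector<std::uint8_t> toUtf16LeBytes(const std::u16string &units, bool withBom)
{
    std::vector<std::uint8_t> out;
    out.reserve(2 * (units.size() + (withBom ? 1 : 0)));

    // Split one 16-bit code unit into two bytes, low byte first.
    auto appendUnit = [&out](std::uint16_t u) {
        out.push_back(static_cast<std::uint8_t>(u & 0x00FF));        // low byte
        out.push_back(static_cast<std::uint8_t>((u & 0xFF00) >> 8)); // high byte
    };

    if (withBom)
        appendUnit(0xFEFF); // ends up as the bytes FF FE

    for (char16_t u : units)
        appendUnit(static_cast<std::uint16_t>(u));

    return out;
}
```

In a Qt program the code units would come from QString::utf16() (with QString::size() giving their count) and each byte would be appended to the QByteArray instead of a std::vector. Because the bytes are produced by masking and shifting rather than by copying memory, the result is the same regardless of the host's byte order.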

Upvotes: 5
