kfytdy

Reputation: 21

PHP QR code generator truncates a UTF-8 phrase

I downloaded the library from http://phpqrcode.sourceforge.net/ and wrote the simplest possible code for it:

include('./phpqrcode/qrlib.php');
QRcode::png('иванов иван иванович 11111');

But the resulting QR code contains only part of the string.

Resulting QR code: 'иванов иван ив'

url - vologda-oblast.ru/coronavirus/qr/parampng.php

What could be wrong?

Upvotes: 2

Views: 2869

Answers (1)

Maxim Masiutin

Reputation: 4782

In your case, the "phpqrcode" library passes the number of characters, rather than the number of bytes, as the length of the UTF-8 string. That’s why the string is truncated. If you QR-encode Latin-only (ASCII) text, the string is not truncated. The truncation occurs only with Cyrillic characters, since each Cyrillic character takes 2 bytes in UTF-8 rather than the single byte needed for a Latin one.
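To make the arithmetic concrete, here is a minimal sketch that counts the question's string both ways and then cuts it at the character count, treated as a byte count. It assumes the mbstring extension is loaded and the source file is saved as UTF-8; the variable names are only illustrative, and explicit encodings are passed so the result does not depend on any php.ini setting.

```php
<?php
// The phrase from the question (this file must be saved as UTF-8).
$text = 'иванов иван иванович 11111';

// Length in characters: 18 Cyrillic letters + 3 spaces + 5 digits.
$charCount = mb_strlen($text, 'UTF-8');   // 26

// Length in bytes: each Cyrillic letter takes 2 bytes in UTF-8.
$byteCount = mb_strlen($text, '8bit');    // 44

// Keeping only $charCount bytes reproduces the truncation from the question.
$truncated = mb_substr($text, 0, $charCount, '8bit');

echo $charCount, "\n";   // 26
echo $byteCount, "\n";   // 44
echo $truncated, "\n";   // иванов иван ив
```

The first 26 bytes cover "иванов иван " (22 bytes) plus two more Cyrillic letters, which is exactly the string the scanned QR code shows.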

Interestingly, the demo example of the library on the author’s page does encode Cyrillic characters correctly.

The truncation happens in your case because you are using the following options in your php.ini file:

mbstring.func_overload = 2
mbstring.internal_encoding = "UTF-8"

If you remove mbstring.func_overload (deprecated since PHP 7.2.0) from php.ini or set it to 0, the "phpqrcode" library will start working properly. Otherwise, the strlen() function used by the library returns the number of characters rather than the number of bytes in a UTF-8-encoded octet string, while str_split(), another function used by the library, always splits the string into single bytes, since it is not affected by mbstring.func_overload. The declared length is therefore smaller than the number of bytes supplied, and your QR codes contain truncated strings.
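A quick way to check whether a given environment is affected is to read the setting at run time and compare the two functions directly. The following is a rough diagnostic sketch, not part of the library; the helper name strlenIsOverloaded() is hypothetical. It relies on the documented meaning of the value 2 (overload the string functions) and on ini_get() returning false when the setting does not exist, as in PHP 8.

```php
<?php
// Hypothetical helper: true when mbstring overloads strlen() and friends.
function strlenIsOverloaded(): bool
{
    $value = ini_get('mbstring.func_overload');
    // ini_get() returns false if the setting is unknown (removed in PHP 8.0).
    return $value !== false && (((int) $value) & 2) === 2;
}

$text = 'иванов иван иванович 11111';   // file saved as UTF-8

// Length the library declares: characters if overloaded, bytes otherwise.
$declared = strlen($text);

// Data the library actually passes: always one array element per byte.
$actual = count(str_split($text));

echo strlenIsOverloaded() ? "overloaded\n" : "not overloaded\n";
echo "declared: $declared, actual: $actual\n";
```

If the two numbers differ, the length passed to the encoder is too small and the payload will be cut off.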

Since you are using the Bitrix Site Manager CMS, removing mbstring.func_overload from php.ini may be problematic until you fully update Bitrix to version 20.5.393 (released in September 2020) or later. Earlier versions relied on this deprecated feature. You can find more information about Bitrix's reliance on it at https://idea.1c-bitrix.ru/remove-dependency-on-mbstring-settingsfuncoverload/ or https://idea.1c-bitrix.ru/?tag=4799

Since you cannot change this php.ini setting at run time, you can try configuring your web server to set PHP options at a per-directory level. Failing that, you can patch the "phpqrcode" library so that it no longer relies on the strlen() function and works correctly, at least partially, in your case. To do that, edit the qrencode.php file as follows. First, change the value of $eightbit in the QREncode class from false to true. Second, in the function encodeString8bit, replace

        $ret = $input->append(QR_MODE_8, strlen($string), str_split($string));

with

        $arr = str_split($string);
        $len = count($arr);
        $ret = $input->append(QR_MODE_8, $len, $arr);

In any case, since the "phpqrcode" library does not currently support Extended Channel Interpretation (ECI) mode, you cannot reliably encode Cyrillic characters with it. It stores text in the QR code's 8-bit byte mode, which by default may only contain ISO-8859-1 (Latin-1) characters unless the default character set is changed by an ECI entry. The library cannot insert an ECI entry into a QR code to indicate that the text is encoded in UTF-8 rather than ISO-8859-1. Some decoding applications will auto-detect that the bytes are actually UTF-8 and show the string correctly, while others (the strictly compliant ones) may not.
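If you want to know in advance whether a given text stays within the default character set, you can test whether every character has a code point no higher than U+00FF, since ISO-8859-1 corresponds to the first 256 Unicode code points. This is only a sketch: the function name isLatin1Only() is hypothetical, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: true if every character of the UTF-8 string
// falls in U+0000..U+00FF, i.e. is representable in ISO-8859-1.
function isLatin1Only(string $utf8): bool
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // preg_match() returns false on invalid UTF-8, which maps to false here.
    return preg_match('/^[\x{00}-\x{FF}]*$/u', $utf8) === 1;
}

var_dump(isLatin1Only('ivanov ivan 11111'));            // bool(true)
var_dump(isLatin1Only('café'));                         // bool(true)
var_dump(isLatin1Only('иванов иван иванович 11111'));   // bool(false)
```

Text that fails this check, such as the Cyrillic phrase from the question, falls outside the default character set, so how it is displayed depends on the decoding application.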

In conclusion, since "phpqrcode" does not currently support ECI, you cannot reliably encode Cyrillic characters with it, but you can at least stop it from truncating the string, as shown above.

Upvotes: 2
