4532066
4532066

Reputation: 2110

Windows ODBC driver - retrieved emoji data rendered as?

Duplicate Thread Note

I already asked a similar question for this issue (Emoji data retrieved via ODBC Connection appears as question mark) but that was more looking at the setup of the classic ASP page, but I am now raising a question to ask for help with the ODBC drivers as I think the issue I have is related to that.

Environment Details

I connect to MySQL version 5.5.50-log (issue also replicated on 5.7.28-log, and on MySQL 8 as well) from classic ASP pages. All running 32 bit MySQL Server, using 32 bit ODBC Drivers.

I am running the code on a Windows 10 PC, but the live site is running on a Windows 2012 Server. The issue exists on both systems.

Issue

The issue is that when I connect to MySQL using an ODBC Driver and display the output, emoji data is displayed as a ? rather than as the emoji even though I can see the emoji is saved correctly in the database. It's just when it's extracted via ODBC and displayed on the page, it is displayed as a ? character.

MySQL Details

The MySQL Database is set up as follows:

Database charset: utf8mb4 Database collation: utf8mb4_general_ci The table and field:

Character Set: utf8mb4 Collation: utf8mb4_general_ci

The MySQL database is set correctly from what I can see, in terms of collation etc:

Variable_name             Value               
------------------------  --------------------
character_set_client      utf8mb4             
character_set_connection  utf8mb4             
character_set_database    utf8mb4             
character_set_filesystem  binary              
character_set_results     utf8mb4             
character_set_server      latin1              
character_set_system      utf8                
collation_connection      utf8mb4_general_ci  
collation_database        utf8mb4_general_ci  
collation_server          latin1_swedish_ci   

To test, I have data in the table, which contains this string: 🍔(T_T) é, è, à, ç

This is a screenshot of the data from the SQLyog:

enter image description here

Test Web Page

This is my test web page:

<!DOCTYPE html>

<html lang="en">
    <head>
        <meta charset="utf-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Test</title>
        <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css">
    </head>
    <body>
    <p>Pasted directly from database field: 🍔(T_T) é, è, à, ç</p>
    <p>Returned from SQL statement: ?(T_T) é, è, à, ç</p>
    </body>
</html>

More notes on the issue

The issue is not that emoji data cannot be displayed on the web page, as the directly pasted content from MySQL appears fine.

The issue is that once the data is returned from an SQL Select from MySQL via the ODBC Driver, it does not render correctly.

Notes about the Classic ASP Page Setup

I have the following set right at the top of the ASP code:

<%@LANGUAGE="VBSCRIPT" CODEPAGE="65001"%>
<%
Response.ContentType = "text/html"
Response.AddHeader "Content-Type", "text/html;charset=UTF-8"
Response.CodePage = 65001
Response.CharSet = "UTF-8"

The ASP page itself is saved as a UTF-8 encoded file:

enter image description here

I know emoji data appears fine on the page as the pasted content from MySQL renders correctly.

ODBC Driver Notes

I think the issue is to do with the ODBC Connector, but I have tried everything I can think of and nothing works.

I have tried these drivers:

I have tried with Connection Strings, and DSN connections, the issue happens on all of them.

Sample connection string:

oConn.Open "DRIVER={MySQL ODBC 5.3 Unicode Driver}; port=33066; option=16387; server=localhost; database=db1083; user=cheese; password=string; option=3; charset=utf8mb4; stmt=set names 'utf8mb4';"

I have tried using DSN connections and setting the Connection properties with the correct charset enabled, and the Initial Statement set on older versions of the ODBC Driver which support using that.

I'm out of ideas now!

Could this be a bug with the ODBC drivers?

Upvotes: 2

Views: 830

Answers (2)

Toothbrush
Toothbrush

Reputation: 2141

The latest version of the MySQL ODBC connector has fixed this bug!

The MySQL Connector/ODBC Driver 8.0.27 now correctly converts such characters (see https://dev.mysql.com/doc/relnotes/connector-odbc/en/news-8-0-27.html).

You will need to specify MySQL ODBC 8.0 Unicode Driver as the ODBC driver.

Upvotes: 1

Gord Thompson
Gord Thompson

Reputation: 123474

As far as I know, the only way to retrieve Unicode supplementary characters (e.g., emoji) with ADODB and MySQL Connector/ODBC is to retrieve them as bytes — SELECT CAST(column_name AS BINARY) — and then decode them. The following example is actually VBA, but tweaking it for VBScript is trivial.

Option Explicit

Sub demo_retrieve_utf8mb4_value()
    Dim conn As New ADODB.Connection
    conn.Open _
        "DRIVER=MySQL ODBC 8.0 Unicode Driver;" & _
        "SERVER=localhost;PORT=3307;" & _
        "UID=root;PWD=(whatever);" & _
        "DATABASE=mydb;" & _
        "charset=utf8mb4;"
    Dim rst As New ADODB.Recordset
    rst.Open "SELECT CAST(emoji AS BINARY) FROM emoji_tbl WHERE id=1", conn
    Dim s As Variant
    s = decode_utf8(rst.Fields(0).Value)
    '
    ' do stuff with the string value

End Sub

Private Function decode_utf8(field_value As Variant) As Variant
    If IsNull(field_value) Then
        decode_utf8 = Null
    Else
        Dim strm As New ADODB.Stream
        With strm
            .Type = adTypeBinary
            .Open
            .Write field_value
            .Flush
            .Position = 0
            .Type = adTypeText
            .Charset = "UTF-8"
            decode_utf8 = .ReadText
            .Close
        End With
        Set strm = Nothing
    End If
End Function

Upvotes: 2

Related Questions