Reputation: 2110
Duplicate Thread Note
I already asked a similar question for this issue (Emoji data retrieved via ODBC Connection appears as question mark) but that was more looking at the setup of the classic ASP page, but I am now raising a question to ask for help with the ODBC drivers as I think the issue I have is related to that.
Environment Details
I connect to MySQL version 5.5.50-log (issue also replicated on 5.7.28-log, and on MySQL 8 as well) from classic ASP pages. All running 32 bit MySQL Server, using 32 bit ODBC Drivers.
I am running the code on a Windows 10 PC, but the live site is running on a Windows 2012 Server. The issue exists on both systems.
Issue
The issue is that when I connect to MySQL using an ODBC Driver and display the output, emoji data is displayed as a ?
rather than as the emoji even though I can see the emoji is saved correctly in the database. It's just when it's extracted via ODBC and displayed on the page, it is displayed as a ?
character.
MySQL Details
The MySQL Database is set up as follows:
Database charset: utf8mb4 Database collation: utf8mb4_general_ci The table and field:
Character Set: utf8mb4 Collation: utf8mb4_general_ci
The MySQL database is set correctly from what I can see, in terms of collation etc:
Variable_name Value
------------------------ --------------------
character_set_client utf8mb4
character_set_connection utf8mb4
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8mb4
character_set_server latin1
character_set_system utf8
collation_connection utf8mb4_general_ci
collation_database utf8mb4_general_ci
collation_server latin1_swedish_ci
To test, I have data in the table, which contains this string: 🍔(T_T) é, è, à, ç
This is a screenshot of the data from the SQLyog:
Test Web Page
This is my test web page:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Test</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css">
</head>
<body>
<p>Pasted directly from database field: 🍔(T_T) é, è, à, ç</p>
<p>Returned from SQL statement: ?(T_T) é, è, à, ç</p>
</body>
</html>
More notes on the issue
The issue is not that emoji data cannot be displayed on the web page, as the directly pasted content from MySQL appears fine.
The issue is that once the data is returned from an SQL Select from MySQL via the ODBC Driver, it does not render correctly.
Notes about the Classic ASP Page Setup
I have the following set right at the top of the ASP code:
<%@LANGUAGE="VBSCRIPT" CODEPAGE="65001"%>
<%
Response.ContentType = "text/html"
Response.AddHeader "Content-Type", "text/html;charset=UTF-8"
Response.CodePage = 65001
Response.CharSet = "UTF-8"
The ASP page itself is saved as a UTF-8 encoded file:
I know emoji data appears fine on the page as the pasted content from MySQL renders correctly.
ODBC Driver Notes
I think the issue is to do with the ODBC Connector, but I have tried everything I can think of and nothing works.
I have tried these drivers:
I have tried with Connection Strings, and DSN connections, the issue happens on all of them.
Sample connection string:
oConn.Open "DRIVER={MySQL ODBC 5.3 Unicode Driver}; port=33066; option=16387; server=localhost; database=db1083; user=cheese; password=string; option=3; charset=utf8mb4; stmt=set names 'utf8mb4';"
I have tried using DSN connections and setting the Connection properties with the correct charset enabled, and the Initial Statement set on older versions of the ODBC Driver which support using that.
I'm out of ideas now!
Could this be a bug with the ODBC drivers?
Upvotes: 2
Views: 830
Reputation: 2141
The latest version of the MySQL ODBC connector has fixed this bug!
The MySQL Connector/ODBC Driver 8.0.27 now correctly converts such characters (see https://dev.mysql.com/doc/relnotes/connector-odbc/en/news-8-0-27.html).
You will need to specify MySQL ODBC 8.0 Unicode Driver
as the ODBC driver.
Upvotes: 1
Reputation: 123474
As far as I know, the only way to retrieve Unicode supplementary characters (e.g., emoji) with ADODB and MySQL Connector/ODBC is to retrieve them as bytes — SELECT CAST(column_name AS BINARY)
— and then decode them. The following example is actually VBA, but tweaking it for VBScript is trivial.
Option Explicit
Sub demo_retrieve_utf8mb4_value()
Dim conn As New ADODB.Connection
conn.Open _
"DRIVER=MySQL ODBC 8.0 Unicode Driver;" & _
"SERVER=localhost;PORT=3307;" & _
"UID=root;PWD=(whatever);" & _
"DATABASE=mydb;" & _
"charset=utf8mb4;"
Dim rst As New ADODB.Recordset
rst.Open "SELECT CAST(emoji AS BINARY) FROM emoji_tbl WHERE id=1", conn
Dim s As Variant
s = decode_utf8(rst.Fields(0).Value)
'
' do stuff with the string value
End Sub
Private Function decode_utf8(field_value As Variant) As Variant
If IsNull(field_value) Then
decode_utf8 = Null
Else
Dim strm As New ADODB.Stream
With strm
.Type = adTypeBinary
.Open
.Write field_value
.Flush
.Position = 0
.Type = adTypeText
.Charset = "UTF-8"
decode_utf8 = .ReadText
.Close
End With
Set strm = Nothing
End If
End Function
Upvotes: 2