Reputation: 1280
I have a problem and I can't work out what is causing it. I'm working on a legacy website written in Classic ASP (oh god, why me), and at apparently random times, without any explanation, the values from an ADODB.Recordset are printed double encoded.
By double encoded I mean the "UTF-8 encoding of the ASCII representation of a UTF-8 multibyte string", so "é" would look like "Ã©" (with the exact same encoding).
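A minimal sketch of how this kind of double encoding can arise, written in Python rather than ASP so the bytes are easy to inspect. It assumes the misreading step treats the UTF-8 bytes as Windows-1252; which single-byte code page is actually involved on this site is not confirmed.

```python
# Sketch: reproduce "double encoding" of a UTF-8 string.
# Assumption: the intermediate step misreads the bytes as Windows-1252.
original = "é"

utf8_bytes = original.encode("utf-8")      # the 2 bytes C3 A9
misread = utf8_bytes.decode("cp1252")      # each byte becomes one character
double_encoded = misread.encode("utf-8")   # the 4 bytes C3 83 C2 A9

# One character in, two characters after the misread, four bytes on the wire.
print(len(original), len(misread), double_encoded)
```

A browser that decodes those four bytes as UTF-8 shows two characters instead of one, which matches the symptom described.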
What is driving me crazy is that this appears to happen at random: about 50% of the time the values are encoded correctly, the other 50% they aren't.
Let me point out that it happens on the same page at different times, so over several page loads the values may display correctly, then broken, then correctly again, and so on.
This happened 7 years ago, in the early days of this website, but a lot of water has passed under the bridge and only one of the people who originally worked on it is still with the company. He can't remember what they did to solve the issue; all he told me was that "the database connection encodings were saved into the session", which perhaps explains why there are so many Session.CodePage = 65001 statements scattered around the pages.
I even tried to force the charset to utf8 via a query, but clearly it didn't work.
The driver used is the old MySQL ODBC 3.51 Driver.
Thanks in advance for any advice or solution (getting rid of Classic ASP is unfortunately not an option).
[UPDATE]
Here is a plot twist: it breaks less often if I output the contents like this:
Session.CodePage = 1252
Response.Write(Property)
Session.CodePage = 65001
Actually I found this code almost everywhere in the website, as if the database driver didn't care at all about the connection's charset.
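One possible reading of why this workaround helps, sketched in Python rather than ASP. It assumes the driver hands back UTF-8 bytes widened to one character per byte; that is an assumption, not something confirmed for the 3.51 driver. Under that assumption, writing with code page 1252 narrows the characters back to the original bytes, while writing with 65001 encodes them a second time.

```python
# Assumption: the recordset value is UTF-8 bytes misread as cp1252 text.
expected = "é".encode("utf-8")
from_driver = expected.decode("cp1252")

# Writing under code page 1252 restores the original UTF-8 bytes.
written_as_1252 = from_driver.encode("cp1252")
# Writing under code page 65001 (UTF-8) encodes the text a second time.
written_as_65001 = from_driver.encode("utf-8")

print(written_as_1252 == expected)    # True: the page shows the right character
print(written_as_65001 == expected)   # False: the page shows the double encoding
```

If the driver sometimes returns correctly decoded text instead, the same 1252 write would emit a single byte that is not valid UTF-8, which would fit "breaks less often" rather than "never breaks".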
Upvotes: 2
Views: 1232
Reputation: 1280
I ran some tests and, thanks to @webaware's advice, I convinced myself to update the ODBC driver to version 5.1. After some tuning the website seemed to stabilize. This is the code I used:
Response.AddHeader "Content-Type", "text/html; charset=UTF-8"
Session.CodePage = 65001
Dim ConnString:ConnString = "driver={MySQL ODBC 5.1 Driver};server=localhost;port=3306;database=database;uid=uid;pwd=pwd"
Other combinations seem to break the output encoding; with this one it now works out of the box.
I hope this helps someone in the future.
Upvotes: 1
Reputation: 1613
It can be really tricky to find the reason for this behavior. But let me point out some facts about Classic ASP that might help you...
Session.CodePage affects the whole duration of the session, meaning all subsequent requests will use the specified codepage. That doesn't stop individual ASP files from using another encoding, though, by specifying another codepage again. So look through your application for pages that specify encodings via either Session.CodePage or Response.CodePage.
Things get really messy here. When form data is posted to the server, there is no provision in the form URL-encoding standard to declare the code page used. Browsers can be told what encoding to use, and they default to the charset of the HTML page containing the form, but there is no mechanism to communicate that choice to the server.
ASP takes the view that the codepage of posted form fields is the same as the codepage of the response it's about to send. Take a moment to absorb that... This means that, quite counterintuitively, the Response.CodePage value has an impact on the strings returned by Request.Form. For this reason it's important to set the correct codepage early: doing some form processing and then setting the codepage later, just before sending the response, can lead to unexpected results.
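To make the effect concrete, here is a small Python sketch (not ASP) of the same posted bytes decoded under two different code pages. The form body and the field name are hypothetical; the point is only that the decoded string depends entirely on which code page is active when the value is read.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical form body, as a UTF-8 page would post the word "café".
posted = "name=caf%C3%A9"
raw_value = unquote_to_bytes(posted.split("=", 1)[1])

# The same bytes decoded under two candidate code pages.
as_65001 = raw_value.decode("utf-8")    # 4 characters, ends in é
as_1252 = raw_value.decode("cp1252")    # 5 characters, ends in Ã©

print(len(as_65001), len(as_1252))
```

Nothing in the request says which of the two readings is right, which is why the codepage in effect at the time Request.Form is read decides the result.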
When the script engine parses the file, the chunks of static content in the file (the stuff outside of script code blocks) are turned into a special form of Response.Write (including string literals). It's special in that, when script execution reaches one of these writes, the processor simply copies the bytes verbatim, as found in the file, directly to the output stream; again, no attempt is made to convert any encodings.
Read the answer to this question for more information: Internal string encoding, Classic ASP
Upvotes: 0