user226722

ASP.NET character encoding problem (UTF-8)

I'm storing some HTML-encoded data in a SQL Server database, and I've written a script that outputs the data in CSV format minus the HTML tags. I'm getting a weird issue when HTML-decoding the remaining data. For example, the data contains a quote character (which is HTML-encoded as ’), but when I try to HTML-decode it, the data comes out as a series of weird characters (’). Does anyone know how to solve this issue? The output encoding of the page is UTF-8, if that helps.

Any advice would be much appreciated!

Cheers

Tim

Upvotes: 0

Views: 3772

Answers (2)

dkarp

Reputation: 14763

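Those three weird characters are the UTF-8 encoding of the character ’ (U+2019, the right single quotation mark that your HTML entity decodes to), read in the wrong charset. UTF-8 encodes it as the octets 0xE2 0x80 0x99, and those bytes render as "’" in your computer's default charset, windows-1252. So I don't think there's a problem with your encoding: the output is valid UTF-8 that is being displayed as windows-1252.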
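The mix-up can be reproduced in a few lines. This is only a sketch: Python is used purely for illustration (the thread is about ASP.NET), and the entity `&rsquo;` is an assumption, since the post does not show which entity the data actually contains.

```python
import html

# Assumed entity; the original post does not show the encoded form.
decoded = html.unescape("&rsquo;")        # the single character U+2019

# UTF-8 represents U+2019 as three bytes.
utf8_bytes = decoded.encode("utf-8")      # 0xE2 0x80 0x99

# Reading those same bytes as windows-1252 yields three separate characters.
misread = utf8_bytes.decode("cp1252")

# The bytes themselves are intact: re-encoding as windows-1252 and
# decoding as UTF-8 recovers the original quote.
recovered = misread.encode("cp1252").decode("utf-8")

print(utf8_bytes.hex(" "))
print(misread)
print(recovered == decoded)
```

Because the round trip recovers the original character, the data is not corrupted; only the charset used to display it is wrong.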

Excel 2000 evidently has known trouble with .csv files in UTF-8 encoding. The solution, bizarrely enough, is to change the filename extension to .txt, at which point Excel 2000 will evidently import the file correctly.
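A different workaround, not part of this answer, is to start the file with a UTF-8 byte-order mark so that a spreadsheet application has a hint about the encoding. Later Excel versions generally use that mark when opening a .csv directly; whether Excel 2000 does is an assumption this sketch does not verify. The helper name `write_csv_with_bom` and the sample rows are hypothetical, and Python is again used only for illustration.

```python
import csv
import io


def write_csv_with_bom(rows):
    """Return CSV data as UTF-8 bytes with a leading byte-order mark."""
    text = io.StringIO(newline="")
    csv.writer(text).writerows(rows)
    # The "utf-8-sig" codec prepends the BOM bytes EF BB BF when encoding.
    return text.getvalue().encode("utf-8-sig")


# Hypothetical sample data containing the right single quotation mark.
data = write_csv_with_bom([["name", "note"], ["Tim", "it\u2019s fine"]])
print(data[:3].hex(" "))
```

In an ASP.NET page the equivalent step would be emitting the same three bytes before the CSV body, but the exact call depends on how the response is written.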

Upvotes: 3

Tor Andersson

Reputation: 109

If the data is read from CSV files, open the CSV file in Notepad, choose Save As from the File menu, and save the file with the encoding set to UTF-8.

Upvotes: 0
