Reputation: 3803
I'm trying to load a CSV file for some JavaScript to munch on and spit out on an HTML page. The CSV has some special characters like ½ and ×. According to Firebug, when I put a breakpoint inside the callback of $.get(), the special characters are already missing at that point. They are replaced with some sort of whitespace that displays as a question mark or box if I copy and paste it into another program.
I have tried
$.ajaxSetup({
dataType: "text" ,
contentType: "text/plain; charset=utf-8"
});
and other variations. The charset declared in my webpage's meta tag is utf-8. I have also tried 8859-1. Nothing so far has worked.
EDIT: placing the characters by hand into the html either as is or using html entity codes works fine. Placing them with javascript works too. The only problem is reading this CSV file.
EDIT2: Try this. Create a text file with this in it: Öç¼». Then create a webpage like so...
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.4.4/jquery.min.js"></script>
<script type="text/javascript">
$.get("encodeme.txt", function(data){
console.log(data);
})
</script>
</head>
<body>
</body>
</html>
All that is logged is a whitespace and a Chinese character: �缻. Notice that the whitespace appears as a question mark thingy when I copy-paste it.
Upvotes: 1
Views: 4755
Reputation:
This is a classic character encoding problem (I think). I never rely on anything more than alphanumeric characters to display. Anything else I escape. Even if your CSV comes back with the proper characters, they still might get mangled once you print them to the DOM. (I had a very nasty experience with French accented characters and properties files which took forever to fix, so I no longer take chances with exotic characters.)
Any characters in your HTML apart from A-Z, numbers, and basic punctuation should be escaped:
&eacute; makes é
&mdash; makes —
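If you'd rather not escape by hand, here is a minimal sketch of doing it in JavaScript. It assumes numeric character references are acceptable in place of named entities, and `escapeNonAscii` is just a name I made up for illustration. It does not touch `&`, `<` or `>`, so it is not a full HTML escaper.

```javascript
// Hypothetical helper: replace every character outside the ASCII range
// with a numeric character reference such as "&#233;".
function escapeNonAscii(text) {
  // Array.from walks the string by code point, so characters outside
  // the Basic Multilingual Plane are handled as single units.
  return Array.from(text, function (ch) {
    var code = ch.codePointAt(0);
    return code > 127 ? "&#" + code + ";" : ch;
  }).join("");
}

console.log(escapeNonAscii("caf\u00E9 \u00BD")); // logs "caf&#233; &#189;"
```

Plain ASCII input comes back unchanged, so it should be safe to run over a whole line of CSV before writing it into the page.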
Upvotes: 0
Reputation: 3803
Blah! I should have seen this sooner. The problem was that the csv file was encoded as ANSI. I did briefly look at the file in Notepad++ and should have noticed the problem there but I foolishly missed it the first time. I selected Format > Convert to UTF-8 in Notepad++ and now it works fine. So Marc B was closest to answering the question, although he didn't post it as an answer for some reason. Now, how to get OpenOffice to encode my files correctly...
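For anyone curious why the test file logged as �缻, here is a small sketch that reproduces it. It assumes "ANSI" means Windows-1252 (which is what Notepad++ usually means by it on a Western-locale machine) and uses the standard `TextDecoder` API, which needs a reasonably modern browser or Node.js.

```javascript
// The four characters "Öç¼»" as an ANSI (Windows-1252) editor saves them:
// one byte per character.
const ansiBytes = new Uint8Array([0xD6, 0xE7, 0xBC, 0xBB]);

// Read as UTF-8: 0xD6 is an incomplete two-byte sequence and becomes U+FFFD,
// then 0xE7 0xBC 0xBB happens to form the valid three-byte sequence for U+7F3B.
const misread = new TextDecoder("utf-8").decode(ansiBytes);

// Read with the encoding the file was actually saved in.
const correct = new TextDecoder("windows-1252").decode(ansiBytes);

console.log(misread); // logs "\uFFFD\u7F3B", i.e. the replacement character plus a CJK character
console.log(correct); // logs "Öç¼»"
```

So the browser wasn't dropping anything; it was decoding Windows-1252 bytes as UTF-8. Re-saving the file as UTF-8 makes the bytes match what the browser expects.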
Upvotes: 1
Reputation: 1518
How about this
$.ajaxSetup({
dataType: "text" ,
scriptCharset: "utf-8" ,
contentType: "application/json; charset=utf-8"
});
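One caveat on the settings above: as far as I know, `contentType` describes the body the request sends, not how the response is decoded, and `scriptCharset` only matters for script/JSONP requests, so neither is likely to change what `$.get()` hands back for a text file. If the file can't be re-saved as UTF-8, a rough alternative is to tell the XHR which charset to assume via `overrideMimeType`. This is only a sketch: `csvRequestOptions` is a made-up helper name, and the charset you pass in is an assumption about how the file was saved.

```javascript
// Hypothetical helper: build $.ajax options that force the charset used
// to decode the response, regardless of what the server declares.
function csvRequestOptions(url, charset) {
  return {
    url: url,
    dataType: "text",
    beforeSend: function (xhr) {
      // overrideMimeType is a standard XMLHttpRequest method; it has to be
      // called before the request is sent, which beforeSend guarantees.
      xhr.overrideMimeType("text/plain; charset=" + charset);
    }
  };
}

// Usage in the browser (not executed here):
// var options = csvRequestOptions("encodeme.txt", "windows-1252");
// options.success = function (data) { console.log(data); };
// $.ajax(options);
```

Converting the file itself to UTF-8 is still the simpler fix when you control the file.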
I also found this function, which replaces special characters in form field values with their HTML entity codes:
function char_convert(){
var chars = ["©","Û","®","ž","Ü","Ÿ","Ý","$","Þ","%","¡","ß","¢","à","£","á","À","¤","â","Á","¥","ã","Â","¦","ä","Ã","§","å","Ä","¨","æ","Å","©","ç","Æ","ª","è","Ç","«","é","È","¬","ê","É","","ë","Ê","®","ì","Ë","¯","í","Ì","°","î","Í","±","ï","Î","²","ð","Ï","³","ñ","Ð","´","ò","Ñ","µ","ó","Õ","¶","ô","Ö","·","õ","Ø","¸","ö","Ù","¹","÷","Ú","º","ø","Û","»","ù","Ü","@","¼","ú","Ý","½","û","Þ","€","¾","ü","ß","¿","ý","à","‚","À","þ","á","ƒ","Á","ÿ","å","„","Â","æ","…","Ã","ç","†","Ä","è","‡","Å","é","ˆ","Æ","ê","‰","Ç","ë","Š","È","ì","‹","É","í","Œ","Ê","î","Ë","ï","Ž","Ì","ð","Í","ñ","Î","ò","‘","Ï","ó","’","Ð","ô","“","Ñ","õ","”","Ò","ö","•","Ó","ø","–","Ô","ù","—","Õ","ú","˜","Ö","û","™","×","ý","š","Ø","þ","›","Ù","ÿ","œ","Ú"];
// Build a numeric character reference (e.g. "©" becomes "&#169;") for each entry in chars.
var codes = [];
for (var n = 0; n < chars.length; n++) {
codes.push(chars[n] ? "&#" + chars[n].charCodeAt(0) + ";" : "");
}
for (var x = 0; x < chars.length; x++) {
if (!chars[x]) { continue; } // skip empty entries, which would otherwise match everywhere
for (var i = 0; i < arguments.length; i++) {
// split/join replaces every occurrence; replace() with a string only replaces the first
arguments[i].value = arguments[i].value.split(chars[x]).join(codes[x]);
}
}
}
Upvotes: 0