Reputation: 4892
I'm parsing a website that uses the Windows-1250 charset, and for the last 3 days I haven't been able to get my page to show the data in the same encoding. My guess is that the problem is somewhere in getting the data out of or into the buffer. I tried installing the iconv module, but that brought a whole new set of problems, so I was wondering whether there is a way to fix this without using iconv.
Basically, I'm getting "ANDRIJAŠEVCI" from the website, and after the code below I get "ANDRIJA?EVCI":
var http = require('http');
var options2 = {
    host: 'vred.hzinfra.hr',
    path: '/hzinfo/default.asp?Category=hzinfo&Service=vred3',
    headers: {"Accept-Charset": "Windows-1250,utf-8;ISO-8859-3,utf-8;ISO-8859-2,utf-8", "Content-Type": "text/html; charset=ISO-8859-2" }
};
var request2 = http.request(options2, function (res){
    var data = new Buffer(0,'utf-8');
    res.on('data', function (chunk) {
        data = Buffer.concat([data,chunk]);
    });
    res.on('end', function () {
        console.log(data.toString('utf-8'));
    });
});
request2.end();
Upvotes: 2
Views: 7076
Reputation: 150882
There are several problems in your code:
- The encoding is canonically named utf8, not utf-8, in Node.js. Node.js accepts both spellings, though, so this alone does not explain the problem.
- The website uses Windows-1250, but you treat the data as utf-8. This cannot work.
- Node.js does not support the Windows-1250 encoding in its Buffer API, so this won't work using pure Node.js, no matter what you do (unless you convert the raw bytes yourself, but I wouldn't recommend that for obvious reasons).
So, to cut a long story short: what you want is hardly possible without an additional library. Basically, you already found the way to go (iconv), but you wrote that there were some additional problems. As you did not say what these problems were, I can only give you the rather generic advice that your code should look somewhat like this:
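var iconv = require('iconv');
// Convert the raw Windows-1250 bytes to UTF-8, then decode the result as a string.
var converter = new iconv.Iconv('windows-1250', 'utf8');
data = converter.convert(data).toString();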
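For illustration only, here is a rough sketch of what "converting the raw bytes" without a library could look like, and why the original output loses the Š. The helper name decodeCroatian1250 and the lookup table are hypothetical and cover only the Croatian letters, not the full Windows-1250 code page, which is one reason a library such as iconv is the safer route. It assumes a Node.js version that provides Buffer.from.

```javascript
// Hypothetical, partial lookup table: Windows-1250 byte -> Unicode character.
// Only the Croatian letters are listed; this is NOT the whole code page.
var CROATIAN_1250 = {
  0x8A: 'Š', 0x9A: 'š',
  0x8E: 'Ž', 0x9E: 'ž',
  0xC6: 'Ć', 0xE6: 'ć',
  0xC8: 'Č', 0xE8: 'č',
  0xD0: 'Đ', 0xF0: 'đ'
};

// Hypothetical helper: decodes a buffer byte by byte.
function decodeCroatian1250(buffer) {
  var out = '';
  for (var i = 0; i < buffer.length; i++) {
    var byte = buffer[i];
    if (byte < 0x80) {
      // ASCII range is identical in Windows-1250 and Unicode.
      out += String.fromCharCode(byte);
    } else if (CROATIAN_1250.hasOwnProperty(byte)) {
      out += CROATIAN_1250[byte];
    } else {
      // Byte not covered by this partial table.
      out += '\uFFFD';
    }
  }
  return out;
}

// "ANDRIJAŠEVCI" as Windows-1250 bytes: plain ASCII except Š, which is 0x8A.
var raw = Buffer.concat([
  Buffer.from('ANDRIJA', 'ascii'),
  Buffer.from([0x8A]),
  Buffer.from('EVCI', 'ascii')
]);

// A lone 0x8A byte is not valid UTF-8, so decoding as utf8 loses the letter.
console.log(raw.toString('utf8'));
// Mapping the byte explicitly recovers it.
console.log(decodeCroatian1250(raw));
```

Decoding the same bytes as utf8 yields a replacement character in place of Š, which many consoles display as "?". A complete solution would need every byte from 0x80 to 0xFF mapped, which is exactly the work a conversion library already does.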
Hope this helps…
Upvotes: 2