Reputation: 3
I've created a simple buffer and written my string "Héllo" into it.
Here's my code:
var str = "Héllo";
var buff = new Buffer(str.length);
buff.write(str);
console.log(buff.toString("utf8"));
However, it returns "Héll" and not "Héllo". Why?
How can I fix that?
Upvotes: 0
Views: 1380
Reputation: 588
UTF-8 characters can have different lengths, from 1 byte to 4 bytes; see this answer: https://stackoverflow.com/a/9533324/4486609
So it is not safe to assume that the number of characters equals the number of bytes, as your code does.
For how to get the right length, see this answer: https://stackoverflow.com/a/9864762/4486609
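To make the variable width concrete, here is a minimal sketch using Buffer.byteLength(), which is part of Node's Buffer API. The sample characters are my own picks for illustration, written as escapes so the precomposed forms are unambiguous:

```javascript
// UTF-8 byte length of single characters of increasing width.
// 'e', 'é' (U+00E9), '€' (U+20AC), and an emoji (U+1F600).
const samples = ['e', '\u00e9', '\u20ac', '\u{1F600}'];
const byteLengths = samples.map((ch) => Buffer.byteLength(ch, 'utf8'));
console.log(byteLengths); // [ 1, 2, 3, 4 ]

// The string from the question: 5 UTF-16 code units, but 6 UTF-8 bytes.
const str = 'H\u00e9llo'; // "Héllo"
const charCount = str.length;
const byteCount = Buffer.byteLength(str, 'utf8');
console.log(charCount, byteCount); // 5 6
```

So a buffer sized with str.length is one byte short for this string.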
Upvotes: 1
Reputation: 70125
.length reports the number of characters (UTF-16 code units), not the number of bytes, but new Buffer() expects the number of bytes. In UTF-8, 'é' requires two bytes, so the last character falls off the end of the buffer and is truncated.
If you don't need to support anything older than Node.js 4.x.x, you can use Buffer.from():
let buffer = Buffer.from('Héllo');
console.log(buffer.toString()); // 'Héllo'
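If you do want to allocate the buffer yourself and then write into it, as in the question, one option is to size it with Buffer.byteLength(). This is only a sketch: it assumes a Node.js version that provides Buffer.alloc() (new Buffer(size) is deprecated in current releases), and the variable names are mine:

```javascript
const str = 'H\u00e9llo'; // "Héllo" with a precomposed é (U+00E9)

// Sized by character count: 5 bytes, one short of the 6 the string needs.
const tooSmall = Buffer.alloc(str.length);
tooSmall.write(str, 'utf8');
const truncated = tooSmall.toString('utf8');
console.log(truncated); // 'Héll'

// Sized by UTF-8 byte length: the whole string fits.
const sized = Buffer.alloc(Buffer.byteLength(str, 'utf8'));
const bytesWritten = sized.write(str, 'utf8');
const roundTripped = sized.toString('utf8');
console.log(bytesWritten, roundTripped); // 6 'Héllo'
```

The first half reproduces the truncation from the question; the second half only changes how the size is computed.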
Upvotes: 1