Michel Ange

Reputation: 3

Node.js buffer with é character

I've created a simple buffer and written my string "Héllo" into it.

Here's my code:

var str = "Héllo";
var buff = new Buffer(str.length);
buff.write(str);

console.log(buff.toString("utf8"));

However, it prints "Héll" instead of "Héllo". Why?

How can I fix that?

Upvotes: 0

Views: 1380

Answers (2)

Artem Dudkin

Reputation: 588

UTF-8 characters can have different lengths, from 1 byte to 4 bytes; see this answer: https://stackoverflow.com/a/9533324/4486609

So it is not safe to assume a fixed number of bytes per character, as your code does.

For how to get the right length, see https://stackoverflow.com/a/9864762/4486609
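
As a rough sketch of the point above (not taken from the linked answers), Buffer.byteLength() reports how many bytes a string needs in a given encoding, so it can be used to size the buffer. The sample characters and variable names here are only illustrative, and the non-ASCII characters are written as escapes so the source file's own encoding does not matter:

```javascript
// Sketch: UTF-8 uses 1 to 4 bytes per character, so size buffers by byte count.
// Samples: 'a', 'é' (U+00E9), '€' (U+20AC), and an emoji (U+1F600).
const samples = ['a', '\u00e9', '\u20ac', '\u{1F600}'];
const byteLengths = samples.map((ch) => Buffer.byteLength(ch, 'utf8'));
console.log(byteLengths); // [ 1, 2, 3, 4 ]

// Allocate by byte length rather than by str.length.
const str = 'H\u00e9llo'; // "Héllo"
const buff = Buffer.alloc(Buffer.byteLength(str, 'utf8'));
buff.write(str, 'utf8');

const roundTrip = buff.toString('utf8');
console.log(roundTrip); // the full string, nothing truncated
```

Buffer.alloc() is used here instead of new Buffer(size), on the assumption that your Node.js version is recent enough to have it.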

Upvotes: 1

Trott

Reputation: 70125

.length reports the number of characters (strictly, UTF-16 code units), not the number of bytes, but new Buffer() expects a size in bytes. In UTF-8, 'é' requires two bytes, so "Héllo" needs six bytes in total. The buffer only holds five, so the last character does not fit and is truncated.
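
To make the mismatch concrete, here is a small sketch that reproduces the truncation with a buffer sized by .length. The variable names are just for illustration, and Buffer.alloc() stands in for new Buffer(size), assuming a Node.js version that provides it:

```javascript
// Sketch: reproduce the truncation by sizing the buffer with .length.
const greeting = 'H\u00e9llo'; // "Héllo", with é written as an escape

const charCount = greeting.length;                     // 5 characters
const byteCount = Buffer.byteLength(greeting, 'utf8'); // 6 bytes in UTF-8

// A buffer with one byte per character is one byte too small.
const tooSmall = Buffer.alloc(charCount);

// write() returns the number of bytes actually written.
const bytesWritten = tooSmall.write(greeting, 'utf8');
const truncated = tooSmall.toString('utf8');

console.log(charCount, byteCount, bytesWritten); // 5 6 5
console.log(truncated); // "Héll", the final "o" did not fit
```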

If you don't need to support anything older than Node.js 4.x.x, you can use Buffer.from():

let buffer = Buffer.from('Héllo');
console.log(buffer.toString()); // 'Héllo'

Upvotes: 1
