Ryan Ballantyne

Reputation: 4094

How can I read a file encoded in UTF-16 in Node.js?

I have to read a file encoded in UTF-16 using Node.js (in chunks, because it is very large). The data from the file will go into MongoDB, so I will need to convert it to UTF-8. From googling, it seems that Node simply does not support this, and that I will have to convert the raw data from a buffer myself. But I suspect there is a better way that I'm just not finding. Any suggestions?

Thanks.

Upvotes: 28

Views: 24129

Answers (2)

mikemaccana

Reputation: 123058

Replace the usual utf8 encoding argument you'd pass when reading a text file with utf16le or ucs2:

var fs = require('fs')
var fileContents = fs.readFileSync('import.csv', 'utf16le')

or:

var fileContents = fs.readFileSync('import.csv','ucs2')
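One thing to watch for: as far as I know, Node does not strip a leading byte order mark when decoding as utf16le, so the string may start with U+FEFF. A minimal sketch, using an in-memory buffer in place of a real file; `stripBom` is a hypothetical helper, not part of Node:

```javascript
// Bytes of a little-endian UTF-16 file containing "hi", preceded by a BOM (FF FE)
const raw = Buffer.from([0xff, 0xfe, 0x68, 0x00, 0x69, 0x00]);

// Decoding keeps the BOM as the first character of the string
const withBom = raw.toString('utf16le');

// Hypothetical helper: drop a leading U+FEFF if one is present
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

const clean = stripBom(withBom);
console.log(withBom.length, clean);
```

The same helper can be applied to the string returned by fs.readFileSync.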

Also, for anyone who found this by searching: if additional � (question mark) characters are showing up in a parsed file, reading a UTF-16 file as UTF-8 is probably the cause. Read the file as UTF-16/UCS-2 instead and the extra characters will disappear.
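readFileSync loads the whole file at once, and the question asks about reading in chunks. One possible approach is to run each chunk through a StringDecoder, which holds back an incomplete 2-byte unit (or half of a surrogate pair) until the next chunk arrives. This is only a sketch: `readUtf16leInChunks`, the temp file name, and the deliberately awkward 3-byte chunk size are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: read a UTF-16LE file in fixed-size chunks and
// pass each decoded piece of text to onText.
function readUtf16leInChunks(filePath, chunkSize, onText) {
  const decoder = new StringDecoder('utf16le');
  const buffer = Buffer.alloc(chunkSize);
  const fd = fs.openSync(filePath, 'r');
  try {
    let bytesRead;
    // position = null means "continue from the current file position"
    while ((bytesRead = fs.readSync(fd, buffer, 0, chunkSize, null)) > 0) {
      // The decoder buffers any trailing partial character internally
      const text = decoder.write(buffer.subarray(0, bytesRead));
      if (text.length > 0) onText(text);
    }
    // Flush whatever the decoder is still holding
    const rest = decoder.end();
    if (rest.length > 0) onText(rest);
  } finally {
    fs.closeSync(fd);
  }
}

// Demo: write a small UTF-16LE file, then read it back 3 bytes at a time
const demoPath = path.join(os.tmpdir(), 'utf16le-demo-' + process.pid + '.txt');
fs.writeFileSync(demoPath, 'héllo, wörld', 'utf16le');

let decoded = '';
readUtf16leInChunks(demoPath, 3, (text) => { decoded += text; });
fs.unlinkSync(demoPath);

// Once decoded, the text is an ordinary JavaScript string;
// re-encode it explicitly only if UTF-8 bytes are needed.
const utf8Bytes = Buffer.from(decoded, 'utf8');
console.log(decoded, utf8Bytes.length);
```

For a non-blocking version, fs.createReadStream(path, { encoding: 'utf16le' }) should give the same boundary handling, since streams with an encoding set decode through StringDecoder as well.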

Upvotes: 45

Matthew Ratzloff

Reputation: 4623

Node supports UCS-2, the UTF-16 subset supported by JavaScript. Try using that.

See this pull request.
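In current Node versions, 'ucs2' is documented as an alias of 'utf16le', so the two names should decode identically, including characters stored as surrogate pairs. A quick check, with a sample string chosen only for illustration:

```javascript
// 😀 lies outside the Basic Multilingual Plane, so UTF-16 stores it as a surrogate pair
const original = 'a😀b';
const bytes = Buffer.from(original, 'utf16le');

// Decode the same bytes under both encoding names
const viaUtf16le = bytes.toString('utf16le');
const viaUcs2 = bytes.toString('ucs2');

console.log(viaUtf16le === original, viaUcs2 === original);
```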

Upvotes: 25
