brandonscript

Reputation: 72855

Buffer.byteLength returning drastically different byte length

Using Node.js v5.1.0, I'm trying to determine the content length of a buffer. As such, I'm doing this:

Buffer.byteLength(self.data, 'utf8')

where self.data looks like this:

<Buffer ff d8 ff e0 00 10 4a 46 49 46 00 01 01 01 00 f0 00 f0 00 00 ff db 00 43 00 05 03 04 04 04 03 05 04 04 04 05 05 05 06 07 0c 08 07 07 07 07 0f 0b 0b 09 ... >

The image I'm loading is 109,055 bytes (111 KB on disk) on the file system (OS X), but my content length calculation is returning 198,147 bytes. If I set the encoding to ascii, it returns 104,793 bytes. Much closer, but still not correct.

Am I calculating this correctly? Do I need to do something to the buffer to get it to return the correct value? If I'm doing it right, why the discrepancy? If I'm doing it wrong, well, please share ;)

Upvotes: 7

Views: 14069

Answers (3)

Venryx

Reputation: 17979

I think the differences are as follows: (EDIT: As per this comment, it appears some of the interpretations below are incorrect; awaiting corrected descriptions/summaries)

  • buffer.length: Buffer's used+reserved length.
  • buffer.byteLength: Buffer's used length.
  • buffer.toString().length: Number of characters in the default string rendering/interpretation of the buffer's bytes. You can change that rendering/interpretation by supplying a different encoding (e.g. toString("ascii")).
  • Buffer.byteLength(buffer, targetEncoding): Understanding the bytes in "buffer" to be using UTF-8 encoding (ie. the bytes are meant to be rendered/interpreted as a UTF-8 string), this returns the number of bytes needed to store that UTF-8 string in some specified encoding (targetEncoding).

On that last item (Buffer.byteLength()), I'm not completely sure my interpretation is correct, but that's the best I can figure from the brief description here: https://nodejs.org/api/buffer.html#buffer_class_method_buffer_bytelength_string_encoding
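To see how the toString() and Buffer.byteLength() items behave on binary data, here is a minimal sketch, assuming a current Node.js release (Buffer.from() did not exist in v5.1.0). It uses the first four bytes of the buffer shown in the question; the variable names are only for illustration.

```javascript
// Assumption: a current Node.js release where Buffer.from() is available.
// These are the first four bytes of the JPEG data shown in the question.
const buf = Buffer.from([0xff, 0xd8, 0xff, 0xe0]);

// The buffer itself holds 4 bytes.
console.log(buf.length); // 4

// None of these bytes forms a valid UTF-8 sequence here, so each one
// decodes to the replacement character U+FFFD.
const decoded = buf.toString('utf8');
console.log(decoded.length); // 4

// Measuring that string as UTF-8 costs 3 bytes per U+FFFD.
const reencodedBytes = Buffer.byteLength(decoded, 'utf8');
console.log(reencodedBytes); // 12
```

If image data were converted to a string before being measured, this kind of inflation could produce a figure like the 198,147 in the question. Whether v5.1.0 performs that conversion is an assumption here, not something confirmed in this thread.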

EDIT: As per this comment, the descriptions above appear to be incorrect for one of either buffer.length or buffer.byteLength (as they had a Buffer whose byteLength was higher than its length). If someone finds the explanation for the mismatch, please comment or edit this post with the fix!

Upvotes: 11

phoenixdown

Reputation: 838

The confusion in the OP is over the Buffer.byteLength() function, which takes a string and encoding as arguments, and is distinct from the .byteLength property on a buffer object.

The top voted answer also confuses these two concepts (Buffer.byteLength returning drastically different byte length), but I can no longer change my vote on it.

Here is an explanation:

let buf = Buffer.alloc(4); // <Buffer 00 00 00 00>
buf.length; // 4
buf.byteLength; // 4
buf.byteLength === buf.length; // true

At this point we know that .byteLength and .length as properties of a buffer object are strictly equal.

buf.write('ab'); // 2 - returns the number of bytes written to buf
buf.length // 4
buf.byteLength // 4

Both .length and .byteLength still return 4. Neither property distinguishes between reserved length and used length, contrary to what the linked answer claims.

The accepted answer is correct about the Buffer.byteLength(string, encoding='utf-8') function (distinct from the .byteLength property), which gives the byte length of a string encoded with the specified encoding argument (default utf-8).
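As a small illustration of the string-based function, here is a sketch assuming a current Node.js release. The sample string is made up for the example.

```javascript
// Assumption: a current Node.js release. The sample string is hypothetical.
// '\u00e9' (e with acute accent) is one UTF-16 code unit but two UTF-8 bytes.
const text = 'h\u00e9llo';

// String length counts UTF-16 code units, not bytes.
console.log(text.length); // 5

// Byte length depends on the encoding passed as the second argument.
const utf8Bytes = Buffer.byteLength(text, 'utf8');
const utf16Bytes = Buffer.byteLength(text, 'utf16le');
console.log(utf8Bytes);  // 6
console.log(utf16Bytes); // 10
```

The static function answers "how many bytes would this string take in that encoding", which is a different question from how many bytes an existing buffer holds.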

Upvotes: 6

Amit

Reputation: 46323

As explained in the documentation, Buffer.byteLength() returns the byte length of a string assuming a specific encoding.

The Buffer type is actually a Uint8Array, a typed-array view over an ArrayBuffer, which means its length in bytes can be read from the byteLength property. It also has a length property that provides the same value.
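A quick way to check this, assuming a current Node.js release where Buffer is implemented as a Uint8Array subclass over an ArrayBuffer (the variable names are illustrative):

```javascript
// Assumption: a current Node.js release (Buffer.alloc() is not in v5.1.0).
const imageBuf = Buffer.alloc(4);

// A Buffer is a Uint8Array view over an underlying ArrayBuffer.
const isTypedArray = imageBuf instanceof Uint8Array;
const hasArrayBuffer = imageBuf.buffer instanceof ArrayBuffer;

// On the buffer instance, both properties report the same number of bytes.
const sameLength = imageBuf.length === imageBuf.byteLength;

console.log(isTypedArray, hasArrayBuffer, sameLength); // true true true
```

For content-length purposes, reading imageBuf.length (or imageBuf.byteLength) directly avoids any string conversion.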

Upvotes: 9
