Sergey

Reputation: 758

Buffer comparison in Node.js

I'm new to Node.js. There is no built-in Buffer comparison, so it seems I should use a module like buffertools for this feature.

But I see a pretty strange behaviour when I compare Buffer objects in pure Node.

> var b1 = new Buffer([170]);
> var b2 = new Buffer([171]);
> b1
<Buffer aa>
> b2
<Buffer ab>
> b1 < b2
false
> b1 > b2
false
> b1 == b2
false

and

> var b1 = new Buffer([10]);
> var b2 = new Buffer([14]);
> b1
<Buffer 0a>
> b2
<Buffer 0e>
> b1 > b2
false
> b1 < b2
true
> b1 == b2
false

What actually happens under the hood?

Upvotes: 23

Views: 10441

Answers (2)

flow

Reputation: 3672

There's already an accepted answer, but I thought I might as well chime in, since I don't find the accepted answer particularly clear or helpful. It's even incorrect, if only because it answers questions that the OP didn't ask. So let's boil that down:

> var b1 = new Buffer([170]);
> var b2 = new Buffer([171]);
> b1 < b2
> b1 > b2
> b1 == b2

All that is asked for is: "how do I perform equivalence and less than / greater than comparison (a.k.a. (total) ordering) on buffers".

The answer is:

  • either do it manually by stepping through the bytes of both buffers and comparing the corresponding bytes, e.g. b1[ idx ] === b2[ idx ],

  • or use Buffer.compare( b1, b2 ) which gives you one of -1, 0, or +1, depending on whether the first buffer would sort before, exactly like, or after the second (sorting a list d that contains buffers is then as easy as d.sort( Buffer.compare )).

Observe I use === in my first example; my frequent comments on this site concerning the abuse of == in JavaScript should make it abundantly clear why that is so.
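Both options can be sketched roughly as follows. This is only an illustration: compareBytes is a hypothetical helper name, not a Node API, and the buffers are created with Buffer.from because new Buffer is deprecated in later Node versions. Buffer.compare and buf.equals are the built-ins; the manual loop is assumed to mirror their ordering (shorter buffer first when one is a prefix of the other).

```javascript
// Hypothetical helper: manual byte-by-byte comparison returning -1, 0, or 1.
function compareBytes(a, b) {
  const len = Math.min(a.length, b.length);
  for (let idx = 0; idx < len; idx++) {
    if (a[idx] !== b[idx]) return a[idx] < b[idx] ? -1 : 1;
  }
  // All shared bytes match: the shorter buffer sorts first.
  if (a.length === b.length) return 0;
  return a.length < b.length ? -1 : 1;
}

const b1 = Buffer.from([170]);
const b2 = Buffer.from([171]);

console.log(compareBytes(b1, b2));   // -1
console.log(Buffer.compare(b1, b2)); // -1
console.log(b1.equals(b2));          // false

// Sorting a list of buffers with the built-in comparator.
const d = [b2, b1];
d.sort(Buffer.compare);              // b1 (aa) now comes before b2 (ab)
```

If only equality matters, b1.equals( b2 ) is probably the shortest way to say it.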

Upvotes: 10

Zirak

Reputation: 39848

That's how the comparison operators work on objects:

var a = {}, b = {};
a === b; //false
a == b; //false
a > b; //false
a < b; //false

var c = { valueOf : function () { return 0; } };
var d = { valueOf : function () { return 1; } };
c === d; //false
c == d; //false
c > d; //false
c < d; //true

Under the hood

(sort of)

Part 1 : Equality

This is the easiest part. Both abstract equality (==, spec) and strict equality (===, spec) check if you're referring to the same object (sort of comparing references). In this case, they are obviously not, so the answer is false (== spec step 10, === spec step 7).

Therefore, in both cases:

b1 == b2 //false
b1 === b2 //false
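A small sketch of what "same object" means here, assuming a Node version that has Buffer.from and buf.equals. The names same and copy are just for illustration:

```javascript
const b1 = Buffer.from([170]);
const same = b1;              // a second reference to the same object
const copy = Buffer.from(b1); // a new object holding identical bytes

console.log(b1 === same);     // true: one object, two references
console.log(b1 === copy);     // false: different objects
console.log(b1 == copy);      // false: == on two objects also compares references
console.log(b1.equals(copy)); // true: compares the bytes instead
```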

Part 2: The Comparison strikes back

Here comes the interesting part. Let's look at how the relational operators (< and >) are defined. Let's follow the call chain in the two cases.

x = b1 //<Buffer aa>
y = b2 //<Buffer ab>

//11.8.5 The Abstract Relational Comparison Algorithm (http://es5.github.com/#x11.8.5)
Let px be the result of calling ToPrimitive(x, hint Number).
Let py be the result of calling ToPrimitive(y, hint Number).

//9.1 ToPrimitive (http://es5.github.com/#x9.1)
InputType is Object, therefore we call the internal [[DefaultValue]] method with hint Number.

//8.12.8 [[DefaultValue]] (hint) http://es5.github.com/#x8.12.8
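With hint Number, we first try the object's valueOf method. A buffer's inherited valueOf returns the buffer itself, which is not a primitive, so we fall back to the object's toString method. If it's defined, call it.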
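The chain can be observed with a plain object standing in for the buffer. This is only a sketch: obj and trace are made-up names, and it relies on the number-hint conversion asking valueOf first and moving on to toString when valueOf does not return a primitive.

```javascript
const trace = [];

// Stand-in object: valueOf returns a non-primitive, like a buffer's does.
const obj = {
  valueOf() { trace.push('valueOf'); return this; },
  toString() { trace.push('toString'); return 'x'; },
};

const result = obj < 'y';  // triggers the conversion of obj to a primitive
console.log(trace);        // [ 'valueOf', 'toString' ]
console.log(result);       // true, because 'x' < 'y'

// A buffer's inherited valueOf hands back the buffer itself,
// so the comparison ends up using toString.
const b1 = Buffer.from([170]);
console.log(b1.valueOf() === b1); // true
```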

And here we've reached the climax: What's a buffer's toString method? The answer lies deep inside node.js internals. If you want, have at it. What we can find out trivially, by experimentation, is:

> b1.toString()
'�'
> b2.toString()
'�'

Okay, that wasn't helpful. You'll notice that in the Abstract Relational Comparison Algorithm (what a big fancy name for <), there's a step for dealing with strings. It compares them character by character using their numeric values, the char codes. Let's look at those:

> b1.toString().charCodeAt(0)
65533
> b2.toString().charCodeAt(0)
65533

65533 is an important number. It's the sum of two squares: 142^2 + 213^2. It also happens to be the Unicode Replacement Character, a character signifying "I have no idea what happened". That's why its hexadecimal equivalent is FFFD.

Obviously, 65533 === 65533, so:

b1 < b2 //is
b1.toString().charCodeAt(0) < b2.toString().charCodeAt(0) //is
65533 < 65533 //false
b1 > b2 //following same logic as above, false
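The same chain would also seem to account for the second test case in the question, where the bytes 10 and 14 are valid single-byte characters and so turn into two different strings. A quick sketch, using Buffer.from in place of the question's new Buffer:

```javascript
const b1 = Buffer.from([10]);
const b2 = Buffer.from([14]);

const s1 = b1.toString(); // '\n'
const s2 = b2.toString(); // '\u000e'

console.log(s1.charCodeAt(0), s2.charCodeAt(0)); // 10 14
console.log(s1 < s2); // true
console.log(b1 < b2); // true, same comparison after conversion
console.log(b1 > b2); // false
```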

And that's that.

Dude, what the hell?

Okay, this must've been confusing, since my attempts at explanation weren't well thought through. To recap, here's what happened:

  1. You created a buffer. Benjamin Gruenbaum helped me recreate your test case by doing:

    var b1 = new Buffer([170]), b2 = new Buffer([171]);

  2. When outputting to console, the values are turned into their hex equivalent (see Buffer#inspect):

    170..toString(16) === 'aa'

    171..toString(16) === 'ab'

  3. However, converting them to a string doesn't use hex encoding. toString decodes the bytes as UTF-8 by default, and a lone 0xaa or 0xab byte is not a valid UTF-8 sequence (again, you're free to delve into the implementation nitty gritty, I won't (oh the irony)). Therefore, when converted to a string, they were represented with the Unicode replacement character.

  4. Since they're different objects, any equality operator will return false.

  5. However, due to the way less-than and greater-than work, they were turned into strings (and then to numbers) for comparison. In light of point #3, that's the same value; therefore, they cannot be less-than or greater-than each other, leading to false.
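To make points 2, 3 and 5 concrete, here is a rough sketch of how the encoding passed to toString changes the picture. It assumes a Node version that supports Buffer.from and the 'latin1' encoding name:

```javascript
const b1 = Buffer.from([170]);
const b2 = Buffer.from([171]);

// What the console shows: hex digits, one pair per byte.
console.log(b1.toString('hex'), b2.toString('hex')); // aa ab

// What the relational operators see: the default (UTF-8) decoding,
// where both lone bytes become the replacement character U+FFFD.
console.log(b1.toString() === '\ufffd');        // true
console.log(b1.toString() === b2.toString());   // true

// With a one-byte-per-character encoding the strings differ again.
console.log(b1.toString('latin1') < b2.toString('latin1')); // true
```

So b1 < b2 is false only because the implicit conversion happens to pick an encoding in which both buffers look the same.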

Finally, just to put a smile on your face:

b1 <= b2 //true
b1 >= b2 //true
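Why true? As far as the spec goes, a <= b is not "a < b or a == b"; it is evaluated as "not (b < a)". Both buffers convert to the same string, so b < a is false and the negation is true. A sketch, again assuming Buffer.from:

```javascript
const b1 = Buffer.from([170]);
const b2 = Buffer.from([171]);

console.log(b1 <= b2);            // true
console.log(!(b2 < b1));          // true, which is how <= is evaluated
console.log(b1 < b2 || b1 == b2); // false, the "intuitive" reading
```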

Upvotes: 43
