Reputation: 2552
I am implementing, in go, a rolling version of the adler32 checksum.
This answer was helpful to double check my maths. However I am struggling at implementing it correctly in golang.
I wrote the following code:
func roll(adler, n, leave, enter uint32) uint32 {
a := adler & 0xffff
b := adler >> 16
a = (a + enter - leave) % MOD
b = (b - n*leave - 1 + a) % MOD
return b<<16 | a
}
It tested it on various inputs and it worked fine, until I decided to run it on random data. Here is a sample where it does not work (I found several of them).
What is baffling me is that the same code in python works perfectly on those inputs:
def roll(adler, n, leave, enter):
a = adler & 0xffff
b = adler >> 16
a = (a + enter - leave) % MOD
b = (b - n*leave - 1 + a) % MOD
return b<<16 | a
For good measure, I am including proof that this works in python. Note that the python checksum matches the non-rolling version of the go checksum (and that part is directly from the go core libraries).
I studied my results on all the other problematic samples, and found that I am never making a mistake on the least significant bits of the checksum (the "a" bits). Also, the error is consistently the same, equals to 0xe10000
. I suspect a peculiarity of how go handles modulo operations on uint32 integers to be the cause of this.
What is happening and how do I fix my code?
Upvotes: 2
Views: 718
Reputation: 112374
The integers in Python are signed. You declared all the integers in the golang version to be unsigned. That's the difference.
When an unsigned number is subtracted from a smaller unsigned number, you get a huge unsigned number that gives a different remainder on division than the small negative difference would. When you wrap, you are effectively adding 232. 232 mod 65521 is 225, or 0xe1
, which is why you're seeing that difference in b
. It's much more likely to wrap on the b
calculation, but it can happen for a
as well, if a
happens to be very small at that step.
Per the comment by @samgak, you also have to worry about the definition of the % operator in different languages for signed values. So the solution that works across different conventions would be to make the values positive by adding as many MOD
s as necessary, before doing the % MOD
. For a
, just add MOD
. For b
, add (1 + n * leave / MOD) * MOD
.
Take care to make sure that the intermediate values don't overflow. The code in go can give erroneous results if n*leave
is large enough to wrap the integer type being used.
Upvotes: 5
Reputation: 82934
I don't know go, but here are some possibilities:
b := adler >> 16 change to b := (adler >> 16) & 0xffff
b = (b - n*leave - 1 + a) % MOD ... what happens if expression in () is negative?
return b<<16 | a ... check operator precedence; (b<<16)|a or b<<(16|a) ?
32-bit machine or 64-bit?
Upvotes: 0