Reputation: 35363
I'm implementing a variation of the SuperFastHash in VBA for use in Excel (32-bit version, so no LongLong available) to hash strings.
To get around the limitations of signed 32-bit Long values, I'm doing the addition and bit-shifting using Double types, and then converting from Double to Long in a way that truncates it at 31 bits (the maximum positive value -- don't want to deal with two's complement and signs).
I'm getting answers and avoiding overflows so far, but I have a suspicion I'm making some mistakes in translation, since most implementations use all 32 bits of a uint and also deal with individual bytes from an array rather than 16-bit values coming from AscW().
Specific portions of the implementation I'm hoping someone can gut-check:
Here's the current code:
Public Function shr(ByVal Value As Long, ByVal Shift As Byte) As Long
shr = Value
If Shift > 0 Then shr = shr \ (2 ^ Shift)
End Function
Public Function shl(ByVal Value As Long, ByVal Shift As Byte) As Long
If Shift > 0 Then
shl = LimitDouble(CDbl(Value) * (2& ^ Shift))
Else
shl = Value
End If
End Function
Public Function LimitDouble(ByVal d As Double) As Long
'' Prevent overflow by lopping off anything beyond 31 bits
Const MaxNumber As Double = 2 ^ 31
LimitDouble = CLng(d - (Fix(d / MaxNumber) * MaxNumber))
End Function
Public Function SuperFastHash(ByVal dataToHash As String) As Long
Dim dataLength As Long
dataLength = Len(dataToHash)
If (dataLength = 0) Then
SuperFastHash = 0
Exit Function
End If
Dim hash As Long
hash = dataLength
Dim remainingBytes As Integer
remainingBytes = dataLength Mod 2
Dim numberOfLoops As Integer
numberOfLoops = dataLength \ 2
Dim currentIndex As Integer
currentIndex = 0
Dim tmp As Double
Do While (numberOfLoops > 0)
hash = LimitDouble(CDbl(hash) + AscW(Mid$(dataToHash, currentIndex + 1, 1)))
tmp = shl(AscW(Mid$(dataToHash, currentIndex + 2, 1)), 11) Xor hash
hash = shl(hash, 16) Xor tmp
hash = LimitDouble(CDbl(hash) + shr(hash, 11))
currentIndex = currentIndex + 2
numberOfLoops = numberOfLoops - 1
Loop
If remainingBytes = 1 Then
hash = LimitDouble(CDbl(hash) + AscW(Mid$(dataToHash, currentIndex + 1, 1)))
hash = hash Xor shl(hash, 10)
hash = LimitDouble(CDbl(hash) + shr(hash, 1))
End If
'' Final avalanche
hash = hash Xor shl(hash, 3)
hash = LimitDouble(CDbl(hash) + shr(hash, 5))
hash = hash Xor shl(hash, 4)
hash = LimitDouble(CDbl(hash) + shr(hash, 17))
hash = hash Xor shl(hash, 25)
hash = LimitDouble(CDbl(hash) + shr(hash, 6))
SuperFastHash = hash
End Function
Upvotes: 0
Views: 613
Reputation: 81115
I would suggest that rather than messing around with doubles, you would probably be better off splitting the 32-bit word into two "16-bit" parts, each of which is held in a signed 32-bit variable (use the lower 16 bits of each variable, and then "normalize" the value between steps:
highPart = (highPart + (lowPart \ 65536)) and 65535
lowPart = lowPart and 65535
Shifting left 16 places simply means copying the low part to the high part and zeroing the low part. Shifting right 16 places simply means copying the high part to the low part and zeroing the high part. Shifting left a smaller number of places simply means shifting both parts separately and then normalizing. Shifting a normalized number right a smaller number of places means shifting both parts left (16-N) bits, normalizing, and shifting right 16 bits.
Upvotes: 1