richardtallent
richardtallent

Reputation: 35363

Is this SuperFashHash implementation computing a proper 31-bit hash of a string?

I'm implementing a variation of the SuperFastHash in VBA for use in Excel (32-bit version, so no LongLong available) to hash strings.

To get around the limitations of signed 32-bit Long values, I'm doing the addition and bit-shifting using Double types, and then converting from Double to Long in a way that truncates it at 31 bits (the maximum positive value -- don't want to deal with two's complement and signs).

I'm getting answers and avoiding overflows so far, but I have a suspicion I'm making some mistakes in translation, since most implementations use all 32 bits of a uint and also deal with individual bytes from an array rather than 16-bit values coming from AscW().

Specific portions of the implementation I'm hoping someone can gut-check:

  1. How I'm testing 16-bit character words rather than 4-byte chunks.
  2. Whether my bit-shifting operations are technically correct, given the caveat that I need to truncate Long values at 31 bits.
  3. Whether the final avalanche piece is still appropriate given the hash only uses 31 bits.

Here's the current code:

Public Function shr(ByVal Value As Long, ByVal Shift As Byte) As Long
    shr = Value
    If Shift > 0 Then shr = shr \ (2 ^ Shift)
End Function

Public Function shl(ByVal Value As Long, ByVal Shift As Byte) As Long
    If Shift > 0 Then
        shl = LimitDouble(CDbl(Value) * (2& ^ Shift))
    Else
        shl = Value
    End If
End Function

Public Function LimitDouble(ByVal d As Double) As Long
    '' Prevent overflow by lopping off anything beyond 31 bits
    Const MaxNumber As Double = 2 ^ 31
    LimitDouble = CLng(d - (Fix(d / MaxNumber) * MaxNumber))
End Function

Public Function SuperFastHash(ByVal dataToHash As String) As Long
    Dim dataLength As Long
    dataLength = Len(dataToHash)
    If (dataLength = 0) Then
        SuperFastHash = 0
        Exit Function
    End If
    Dim hash As Long
    hash = dataLength
    Dim remainingBytes As Integer
    remainingBytes = dataLength Mod 2
    Dim numberOfLoops As Integer
    numberOfLoops = dataLength \ 2
    Dim currentIndex As Integer
    currentIndex = 0
    Dim tmp As Double
    Do While (numberOfLoops > 0)
        hash = LimitDouble(CDbl(hash) + AscW(Mid$(dataToHash, currentIndex + 1, 1)))
        tmp = shl(AscW(Mid$(dataToHash, currentIndex + 2, 1)), 11) Xor hash
        hash = shl(hash, 16) Xor tmp
        hash = LimitDouble(CDbl(hash) + shr(hash, 11))
        currentIndex = currentIndex + 2
        numberOfLoops = numberOfLoops - 1
    Loop
    If remainingBytes = 1 Then
        hash = LimitDouble(CDbl(hash) + AscW(Mid$(dataToHash, currentIndex + 1, 1)))
        hash = hash Xor shl(hash, 10)
        hash = LimitDouble(CDbl(hash) + shr(hash, 1))
    End If
    '' Final avalanche
    hash = hash Xor shl(hash, 3)
    hash = LimitDouble(CDbl(hash) + shr(hash, 5))
    hash = hash Xor shl(hash, 4)
    hash = LimitDouble(CDbl(hash) + shr(hash, 17))
    hash = hash Xor shl(hash, 25)
    hash = LimitDouble(CDbl(hash) + shr(hash, 6))
    SuperFastHash = hash
End Function

Upvotes: 0

Views: 613

Answers (1)

supercat
supercat

Reputation: 81115

I would suggest that rather than messing around with doubles, you would probably be better off splitting the 32-bit word into two "16-bit" parts, each of which is held in a signed 32-bit variable (use the lower 16 bits of each variable, and then "normalize" the value between steps:

highPart = (highPart + (lowPart \ 65536)) and 65535
lowPart = lowPart and 65535

Shifting left 16 places simply means copying the low part to the high part and zeroing the low part. Shifting right 16 places simply means copying the high part to the low part and zeroing the high part. Shifting left a smaller number of places simply means shifting both parts separately and then normalizing. Shifting a normalized number right a smaller number of places means shifting both parts left (16-N) bits, normalizing, and shifting right 16 bits.

Upvotes: 1

Related Questions