0__
0__

Reputation: 67330

Migrate from MurmurHash to MurmurHash3

In Scala 2.10, MurmurHash for some reason is deprecated, saying I should use MurmurHash3 now. But the API is different, and there is no useful scaladocs for MurmurHash3 -> fail.

For instance, current code:

trait Foo {
  type Bar
  def id: Int
  def path: Bar

  override def hashCode = {
    import util.MurmurHash._
    var h = startHash(2)
    val c = startMagicA
    val k = startMagicB
    h = extendHash(h, id, c, k)
    h = extendHash(h, path.##, nextMagicA(c), nextMagicB(k))
    finalizeHash(h)
  }
}

How would I do this using MurmurHash3 instead? This needs to be a fast operation, preferably without allocations, so I do not want to construct a Product, Seq, Array[Byte] or whathever MurmurHash3 seems to be offering me.

Upvotes: 7

Views: 3080

Answers (2)

Rex Kerr
Rex Kerr

Reputation: 167911

The MurmurHash3 algorithm was changed, confusingly, from an algorithm that mixed in its own salt, essentially (c and k), to one that just does more bit-mixing. The basic operation is now mix, which you should fold over all your values, after which you should finalizeHash (the Int argument for length is for convenience also, to help with distinguishing collections of different length). If you want to replace your last mix by mixLast, it's a little faster and removes redundancy with finalizeHash. If it takes you too long to detect what the last mix is, just mix.

Typically for a collection you'll want to mix in an extra value to indicate what type of collection it is.

So minimally you'd have

override def hashCode = finalizeHash(mixLast(id, path.##), 0)

and "typically" you'd

// Pick any string or number that suits you, put in companion object
val fooSeed = MurmurHash3.stringHash("classOf[Foo]")   

// I guess "id" plus "path" is two things?
override def hashCode = finalizeHash(mixLast( mix(fooSeed,id), path.## ), 2)

Note that the length field is NOT there to give a high-quality hash that mixes in that number. All mixing of important hash values should be done with mix.

Upvotes: 7

user500592
user500592

Reputation:

Looking at the source code of MurmurHash3 suggests something like this:

override def hashCode = {
  import util.hashing.MurmurHash3._

  val h = symmetricSeed // I'm not sure which seed to use here
  val h1 = mix(h, id)
  val h2 = mixLast(h1, path ##)
  finalizeHash(h2, 2)
}

or, in (almost) one line:

import util.hashing.MurmurHash3._
override def hashCode = finalizeHash(mix(mix(symmetricSeed, id), path ##), 2)

Upvotes: 4

Related Questions