rubberband876
rubberband876

Reputation: 63

Is a bloom filter with only one input hash still a bloom filter?

If I implement a bloom filter where only one hashing algorithm is used (e.g. murmur), is this still considered a bloom filter?

For example, if a hashes to 5, then bit 5 of the filter will be set. If b hashes to 1, then bit 1 of the filter will be set, and so on...

For something to be considered a bloom filter, do at least two bits in the filter have to be set? If only one bit is set, is it called something else?

Upvotes: 6

Views: 956

Answers (1)

Thomas Mueller
Thomas Mueller

Reputation: 50127

It is still a Bloom filter: one with k=1. Depending on the bits per element, it is probably not be the most space-saving one. But there are various reasons why one might pick a k that is not round(bitsPerKey * log(2)), the main ones are:

  • To be able to better compress: here a Bloom filter with k=1 is best. See also the paper "Compressed Bloom Filters" from Michael Mitzenmacher.
  • To speed up lookup and update: using a lower k is faster.

By the way, you can still pick k to be the most space-saving one, even if you only use one "application hash function" (like Murmur hash with 64 bits). You just pick the "Bloom hash functions" to be a function of this "application hash function" (64-bit Murmur hash), like so (assuming int is 32 bit and long is 64 bit):

long m = murmur(x)
h(x, i) = (int) (m >> 32) + i * (int) m 

And that's actually easier and faster than calculating multiple "application hash functions". In a loop, that looks like this:

long m = murmur(x)
int hash = (int) (m >> 32);
int add = (int) m;
for (int i = 0; i < k; i++) {
    ... test / set the bit depending on "hash" ...
    hash += add;
}

Many Bloom filter libraries do it like this, for example the Bloom filter implementation in Guava.

Upvotes: 3

Related Questions