Reputation: 399
How to create a bloom filter to remove the duplicate elements from a stream of integers in O(n) time complexity & O(1) space complexity ? If possible, i would appreciate if some one can point me in right direction ?
Upvotes: 2
Views: 3652
Reputation: 2812
Google Guava offers a bloom filter implementation.
Note that bloom filter is not enough by itself. If bloom filter claims that a number is not in it, then it's not in it. But if it claims that the number is already in it, there's a chance that it's wrong. So you need to have another datastructure there to be sure and use bloomfilter to reduce the number of lookups in that datastructure.
Upvotes: 1
Reputation: 55619
I'm fairly certain it's just:
For each element:
Now there are two problems with this:
I don't believe either of these problems can be avoided given the constraints - both are integral parts of using (only) bloom filters.
If we weren't dealing with a stream, but rather a list, we could get rid of the false positives by recording all the elements picked up by the bloom filter and go through the list again checking against our candidate list instead to make sure they're actual duplicates. This is still O(n) time, but obviously not O(1) space.
Upvotes: 4