Reputation: 21
I have a small set of integers (large range). I need to be able to query this data structure at 0(1) and say whether or not an integer exists in the collection. I am currently using Java & Colt OpenIntIntHashMap. I have a lot of these OpenIntIntHashMaps data structures, and each contain between 5-15 integers. I query in high volume. I just care for the existence of an integer - I do not care about storing a value for the key. Is there another representation or something faster to help solve this problem?
Upvotes: 1
Views: 596
Reputation: 269677
With such a small "n", I wouldn't base a decision on the complexity of the structure; I'd profile and see what is actually fastest.
An Ο(log(n)) algorithm can be faster than an Ο(1) algorithm up to a certain value of n. The Ο notation just tells you how the algorithm scales. In fact, if you have a lot of these maps, you might worry more about space efficiency.
Upvotes: 1
Reputation: 10003
with 10 numbers, hash set and tree set perform about the same - about 10^7 on my old machine.
import java.util.*;
public class Main {
int n = 10;
Random random = new Random();
Set<Integer> hashSet = new HashSet<Integer>(n);
Set<Integer> treeSet = new HashSet<Integer>(n);
{
List<Integer> numbers = new LinkedList<Integer>();
for (int i = 0; i < n; i++)
numbers.add(random.nextInt());
System.out.println(numbers);
hashSet.addAll(numbers);
treeSet.addAll(numbers);
System.out.println(numbers);
}
int hits, misses;
void init() {
}
long time(Set<Integer> set, int n) {
long t0 = System.currentTimeMillis();
for (int i = 0; i < n; i++)
if (set.contains(random.nextInt())) hits++;
else
misses++;
return System.currentTimeMillis() - t0;
}
void print(long dt, int n, String set) {
System.out.println(set + " " + n + " trials in " + dt + " ms. = " + 1000. * n / dt + "trials/sec.");
}
public static void main(String[] args) {
int trials = 10000000;
long dt=0;
for (int i = 0; i < 10; i++) {
Main main = new Main();
dt = main.time(main.hashSet, trials);
main.print(dt, trials, "hashSet");
dt = main.time(main.treeSet, trials);
main.print(dt, trials, "treeSet");
}
}
}
Upvotes: 2
Reputation:
I'm afraid I'm not an expert, but I think it will be difficult to achieve true O(1) lookup. However, you might want to look at some sort of perfect hashing. If these sets are small then you may have the opportunity to generate a 'perfect' hash function for them - i.e. it would have no collisions among the items in the set. This should then allow you O(1) lookup - you take your integer, hash it, the hash takes you to exactly one place, and then you do exactly one comparison to check if the integer is inside the set or out. You only need to check once because unlike with an imperfect hash, there are no collisions (between anything in the set, things inside and out may collide), so you only need to check one cell in the table to see if an integer is in the set. This should be O(1) lookup, but the cost of computing such a hash function may be high, and the hash function itself may be much more expensive than general purpose hash functions (although still O(1)). I have no experience using these in Java at all, but JPerf seems to have methods to generate them for any kind of Java objects, which I imagine you might be able to modify to specialise to primitive ints (JPerf is GPLv2).
http://www.anarres.org/projects/jperf/
http://en.wikipedia.org/wiki/Hash_function#Minimal_perfect_hashing
Upvotes: 4