Reputation: 13458
I want an efficient data structure that represents a set of integers from the range 0..k, where k is very small, let's say less than 1024.
I have come up with an implementation with the following complexities:
initialize: O(1)
size: O(1)
contains: O(1)
insert: O(1)
remove: O(1)
clear: O(1)
union: O(k)
However, I suspect my O(1)s might really be O(log(k)), since we have to read O(log(k)) digits somewhere. The memory requirement is O(k).
I have not implemented equals, intersection or difference because I don't need them, but they all would be O(k). My implementation does support iteration.
This seems like it could come up in many different problems, so I'm wondering has my implementation been done before?
And more importantly: Can we do better?
Obviously better is a little subjective, but something with only O(1) and O(log(k)) everywhere might be interesting. I suspect bitsets could be more practical when k is a little bigger, but I'm not sure for such small k.
Here is pseudocode of what I have. It works by allocating an array and a map (implemented as another array) of lengths k and k+1, but only caring about the first m elements of the array, where m is the current size of the set.
class IntSet:
    m: int
    arr: array[int]
    map: array[int]

    init(k):
        arr: int[k]   # allocate, but don't initialize
        map: int[k+1] # allocate, but don't initialize
        m = 0

    size(): m

    contains(x: int): map[x] < m and arr[map[x]] == x

    insert(x: int):
        if not contains(x):
            arr[m] = x
            map[x] = m
            m += 1

    remove(x: int):
        if contains(x):
            m -= 1
            arr[map[x]] = arr[m]  # move x off the end
            map[arr[m]] = map[x]

    clear(): m = 0

    union(s: IntSet):
        for i in 0..s.m:
            if not contains(s.arr[i]):
                insert(s.arr[i])

    items():
        for i in 0..m:
            yield arr[i]
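For reference, this structure has been done before: it is the classic "sparse set" (Briggs and Torczon describe it for compiler data-flow sets). A runnable C sketch of the pseudocode above might look like the following; the fixed K, the IntSet name and the is_* function names are mine. The arrays are deliberately never initialized, which is what makes init O(1): a stale map[x] can never produce a false positive, because arr[0..m) holds only genuine members and the guard checks arr[map[x]] == x.

```c
#define K 1024  /* elements are drawn from 0..K */

typedef struct {
    int m;            /* current size of the set */
    int arr[K + 1];   /* dense array: the first m slots hold the members */
    int map[K + 1];   /* map[x] = position of x in arr, if x is a member */
} IntSet;

static void is_init(IntSet *s) { s->m = 0; }   /* also serves as clear() */

static int is_contains(const IntSet *s, int x) {
    /* the unsigned cast also rejects negative garbage in map[x] */
    return (unsigned)s->map[x] < (unsigned)s->m && s->arr[s->map[x]] == x;
}

static void is_insert(IntSet *s, int x) {
    if (!is_contains(s, x)) {
        s->arr[s->m] = x;
        s->map[x] = s->m;
        s->m++;
    }
}

static void is_remove(IntSet *s, int x) {
    if (is_contains(s, x)) {
        s->m--;
        s->arr[s->map[x]] = s->arr[s->m];  /* move the last member into x's slot */
        s->map[s->arr[s->m]] = s->map[x];  /* and fix that member's map entry */
    }
}
```

(Strictly speaking, reading the uninitialized arrays is indeterminate in C; in practice the guard makes the stale values harmless.)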
EDIT:
Based on my understanding, I imagine a bitset implementation would look like:

initialize: O(k)
size: O(popcount of k bits) (or O(1) if a count is maintained)
contains: O(1)
insert: O(1)
remove: O(1)
clear: O(k)
union: O(bitwise OR of k bits)
If we use a bitset with 1024 bits I'm not sure if this is better.
Upvotes: 0
Views: 889
Reputation: 11284
This will look a little bit hacky, but you can store the set in an array of 64-bit integers, int64[16], or similarly 32-bit integers, int32[32]. The top 4 bits of an element determine which array index it belongs to, and the low 6 bits determine which bit is set in that integer. All operations will be O(1), except clear and union, which are O(k / 64), where k is the maximum element in the set.
So first, we have int64[16] set.
To add an element x:

set[x >> 6] |= 1ULL << (0x3F & x) // keep only the low 6 bits of x; 1ULL so the shift is done in 64 bits
To remove an element:

set[x >> 6] &= ~(1ULL << (0x3F & x)) // &= ~ rather than ^=, so removing an absent element is a no-op
To clear:

for (int i = 0; i < 16; i++) {
    set[i] = 0;
}
To union two sets a and b:

for (int i = 0; i < 16; i++) {
    set[i] = a[i] | b[i];
}
As requested, to check whether the set contains x:

(set[x >> 6] & (1ULL << (0x3F & x))) != 0
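Putting the pieces together, a self-contained C sketch might look like this (the BitSet1024 type and the bs_* names are mine; note the 1ULL literals, which are needed so the shifts happen in 64-bit arithmetic rather than overflowing a 32-bit int):

```c
#include <stdint.h>
#include <string.h>

/* 1024-bit set stored as 16 x 64-bit words */
typedef struct { uint64_t w[16]; } BitSet1024;

static void bs_clear(BitSet1024 *s)         { memset(s->w, 0, sizeof s->w); }
static void bs_insert(BitSet1024 *s, int x) { s->w[x >> 6] |=  1ULL << (x & 0x3F); }
static void bs_remove(BitSet1024 *s, int x) { s->w[x >> 6] &= ~(1ULL << (x & 0x3F)); }

static int bs_contains(const BitSet1024 *s, int x) {
    return (int)((s->w[x >> 6] >> (x & 0x3F)) & 1);
}

/* union is one OR per word: 16 iterations for k < 1024 */
static void bs_union(BitSet1024 *out, const BitSet1024 *a, const BitSet1024 *b) {
    for (int i = 0; i < 16; i++)
        out->w[i] = a->w[i] | b->w[i];
}
```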
To keep track of the number of elements in the set, an obvious solution is to go through every bit of each word in the array, which would be O(k). There exist ways to count the set bits of one word in O(1), like this one: How to count the number of set bits in a 32-bit integer?
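For example, with GCC or Clang one hardware popcount per 64-bit word brings size down to 16 builtin calls (a sketch; __builtin_popcountll is compiler-specific, not standard C):

```c
#include <stdint.h>

/* Count the elements of a 1024-bit set stored as uint64_t[16]:
   one popcount per word, 16 words total. */
static int bitset_size(const uint64_t set[16]) {
    int n = 0;
    for (int i = 0; i < 16; i++)
        n += __builtin_popcountll(set[i]);  /* number of set bits in this word */
    return n;
}
```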
In addition, if we switch to an int8[128] set instead, meaning only the low 3 bits of the number determine which bit to set, we can keep a hardcoded 256-entry table of how many bits are set in each byte. For example, a union that also maintains the element count:
int numberOfElement = 0;
int bitCount[1 << 8]; // pre-populated, so bitCount[i] gives how many bits are set in i

for (int i = 0; i < 128; i++) {
    set[i] = a[i] | b[i];
    numberOfElement += bitCount[set[i]];
}
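The pre-populated bitCount table itself can be built once with a simple recurrence (a sketch; the function name is mine): the popcount of i equals the popcount of i with its lowest bit shifted off, plus that lowest bit.

```c
/* bitCount[i] = number of set bits in byte i, for all 256 byte values */
static int bitCount[1 << 8];

static void build_bit_count(void) {
    bitCount[0] = 0;
    for (int i = 1; i < (1 << 8); i++)
        bitCount[i] = bitCount[i >> 1] + (i & 1);  /* popcount(i) = popcount(i>>1) + low bit */
}
```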
Upvotes: 1
Reputation: 11968
Asymptotically your data structure is fairly neat (memory is O(k), since you need about 2*k ints to store everything).
I think the constant might be a bit high, and memory usage is less than ideal.
Memory-wise, you are right that a bitset of 1024 bits (128 bytes) would work just as well. Compare that with your set: 2 * 1024 * 4 bytes = 8 KB.
You are concerned about initialization. You need to allocate memory anyway, and there's a good chance you can find 128 bytes already available (compared to 8 KB). You do need to initialize them, but on modern architectures that might be one or two SIMD instructions (the compiler will do it for you when optimizations are enabled and you specify a target platform that supports them).
In addition, 128 bytes fit in two 64-byte cache lines, so all your data will likely fit into L1 cache.
Contains is called a lot in your code. In your version you need to evaluate arr[map[x]] == x. That's two chained lookups, so you have twice the chance of a cache miss (and since the data is so large, misses are likely). It also can't be easily optimized: the CPU has to wait for the value from the first lookup before it can issue the second one.
So, all in all, memory aside, the two data structures are fairly similar asymptotically. In practice I would bet the bitset is going to be significantly faster, especially if you enable optimizations in your compiler.
In any case, to decide you should write a benchmark of your code and run it on the intended platform. Then swap the data-structures and pick the one that gives you the best results.
Upvotes: 1