ljere

Reputation: 31

Mapping long bit arrays to a lookup table

Given a set of n (n < 1000) unique bit arrays (for example, instances of the C++ std::bitset class template) of length l (l > 64), what is the best way to map them into a lookup table L with values from 0 to n-1 so that lookups are as fast as possible?

What I want to achieve is the following:

L["000010001 ... 00101010"] = 0
L["111000000 ... 01000100"] = 1
...
L["001101100 ... 01010111"] = n-1

The bit arrays, if interpreted as integers, are not in any particular order.

I am currently using std::unordered_map<std::bitset<81>, int> and std::unordered_map::find, but I have a feeling that there is a faster way to do it.
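For context, a minimal sketch of the current approach (assuming l = 81 as in the code above; the function names are just illustrative, and the standard library already provides a std::hash specialization for std::bitset, so the default hasher is used implicitly):

    #include <bitset>
    #include <unordered_map>
    #include <vector>

    // Lookup table: bit pattern -> index, using the default std::hash<std::bitset<81>>.
    std::unordered_map<std::bitset<81>, int> L;

    void build(const std::vector<std::bitset<81>>& patterns) {
        for (int i = 0; i < static_cast<int>(patterns.size()); ++i)
            L[patterns[i]] = i;            // map each pattern to its index 0..n-1
    }

    int lookup(const std::bitset<81>& key) {
        auto it = L.find(key);             // average O(1); hashes the whole bitset
        return it != L.end() ? it->second : -1;
    }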

Upvotes: 2

Views: 126

Answers (2)

midor

Reputation: 5557

It depends strongly on what you care about:

  • memory: probably a trie
  • performance: a hash set used as a sparse array (there are generally faster implementations than std::unordered_map, and knowing the input domain often lets you optimize the hash function heavily; see the sketch below)
  • if the input is structured, possibly just a mask operation

Indexing and searching long, simple sequences is extensively researched in genetics, so you may find suitable algorithms there.
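For the "performance" option, here is a minimal sketch of what a domain-specific hash could look like (the struct name and the mixing constant are illustrative assumptions, not a tuned solution):

    #include <bitset>
    #include <cstddef>
    #include <cstdint>
    #include <unordered_map>

    // Hypothetical custom hash: split the 81 bits into two 64-bit words and mix
    // them, instead of going through the generic std::hash<std::bitset<81>>.
    struct Bitset81Hash {
        std::size_t operator()(const std::bitset<81>& b) const {
            static const std::bitset<81> low_mask = std::bitset<81>().set() >> 17; // low 64 bits set
            std::uint64_t lo = (b & low_mask).to_ullong();   // bits 0..63
            std::uint64_t hi = (b >> 64).to_ullong();        // bits 64..80
            // Simple multiply-xor mix; tune the constant/scheme for your data.
            return static_cast<std::size_t>(lo ^ (hi * 0x9e3779b97f4a7c15ULL));
        }
    };

    using Lookup = std::unordered_map<std::bitset<81>, int, Bitset81Hash>;

Whether this beats the default hash depends on the data and the standard-library implementation, so benchmark it against the plain std::unordered_map before switching.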

Upvotes: 2

Serge Ballesta

Reputation: 148965

std::unordered_map has strong advantages: it exists, has been extensively tested, and is optimized.

The only alternative I can imagine would be a binary search in a sorted array of (bit_pattern, index) pairs: it requires fewer than 10 comparisons for an array of size < 1000.

But... that needs code, tests, and benchmarks...

My now-grey hair tells me: if it already exists and meets your needs, then use it.
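For reference, a minimal sketch of that binary-search alternative (the comparator and function names are illustrative; std::bitset has no operator<, so the comparator orders the patterns by their 64-bit halves):

    #include <algorithm>
    #include <bitset>
    #include <cstdint>
    #include <utility>
    #include <vector>

    // Illustrative ordering: compare the high 17 bits first, then the low 64 bits.
    struct Bitset81Less {
        static std::uint64_t hi(const std::bitset<81>& b) { return (b >> 64).to_ullong(); }
        static std::uint64_t lo(const std::bitset<81>& b) {
            static const std::bitset<81> mask = std::bitset<81>().set() >> 17; // low 64 bits set
            return (b & mask).to_ullong();
        }
        bool operator()(const std::pair<std::bitset<81>, int>& a,
                        const std::pair<std::bitset<81>, int>& b) const {
            return hi(a.first) != hi(b.first) ? hi(a.first) < hi(b.first)
                                              : lo(a.first) < lo(b.first);
        }
    };

    // table must be sorted once with Bitset81Less before any lookups.
    int find_index(const std::vector<std::pair<std::bitset<81>, int>>& table,
                   const std::bitset<81>& key) {
        auto it = std::lower_bound(table.begin(), table.end(),
                                   std::make_pair(key, 0), Bitset81Less{});
        return (it != table.end() && it->first == key) ? it->second : -1;
    }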

Upvotes: 2
