Reputation: 4959

Super long arrays in C++

I have two sets A and B. Set A contains unique elements. Set B contains all elements. Each element in B is a 10 by 10 matrix where all entries are either 1 or 0. I need to scan through set B, and every time I encounter a new matrix I will add it to set A. Therefore set A is a subset of B containing only unique matrices.

Upvotes: 2

Answers (5)

overcoder

Reputation: 1553

Here is some code, though it may not be very efficient:

#include <algorithm>
#include <bitset>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// I assume your 10x10 boolean matrix is implemented as a bitset of 100 bits.

// Comparison of bitsets
template<size_t N>
class bitset_comparator
{
    public :
      bool operator () (const std::bitset<N> & a, const std::bitset<N> & b) const
      {
          for(size_t i = 0 ; i < N ; ++i)
          {
              if( !a[i] && b[i] )       return true ;
              else if( !b[i] && a[i] )  return false ;
          }
          return false ;
      }
} ;

int main(int, char * [])
{
    std::set< std::bitset<100>, bitset_comparator<100> > A ;
    std::vector< std::bitset<100> >                      B ; 


    // Fill B in some manner ...

    // Keeping unique elements in A
    std::copy(B.begin(), B.end(), std::inserter(A, A.begin())) ;
}

You can use std::list instead of std::vector. The relative order of elements in B is not preserved in A (elements in A are sorted).

EDIT : I inverted A and B in my first post. It's correct now. Sorry for the inconvenience. I also corrected the comparison functor.

Upvotes: 3

Tugrul Ates

Reputation: 9687

You don't need N buckets where N is the number of all possible inputs. A balanced binary search tree will do just fine, and that is what the std::set class in C++ typically provides.

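#include <set>
#include <vector>
using namespace std;

// Note: names are swapped relative to the question; here A is the input.
vector<vector<vector<int> > > A; // vector of 10x10 matrices
// fill the matrices in A here

set<vector<vector<int> > > B(A.begin(), A.end()); // voila!
// now B contains each distinct matrix from A exactly once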

Upvotes: 0

Mark Ransom

Reputation: 308432

Convert each matrix into a string of 100 binary digits, one string per line, and pipe the result through the Linux utilities:

sort | uniq

If you really need to do this in C++, you can implement your own merge sort; the uniq part then becomes trivial.
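A minimal sketch of the same sort-then-uniq idea in C++, assuming each matrix has already been encoded as a 100-character string of '0' and '1'. It uses std::sort instead of a hand-written merge sort, and the helper name unique_matrices is made up for illustration:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the distinct matrices in `b`.
// Assumption: each matrix is encoded as a 100-character string of '0'/'1'.
std::vector<std::string> unique_matrices(std::vector<std::string> b)
{
    // The "sort" step: equal strings end up next to each other.
    std::sort(b.begin(), b.end());
    // The "uniq" step: drop adjacent duplicates.
    b.erase(std::unique(b.begin(), b.end()), b.end());
    return b;
}
```

The result comes back in sorted order, so the original order of B is lost, just as with sort | uniq.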

Upvotes: 0

paxdiablo

Reputation: 882146

Each element in B is a 10 by 10 matrix where all entries are either 1 or 0.

Good, that means it can be represented by a 100-bit number. Let's round that up to 128 bits (sixteen bytes).

One approach is to use linked lists - create a structure like (in C):

struct sNode {
    unsigned char bits[16];   /* 100 bits, padded to 128 */
    struct sNode *next;       /* next node in sorted order */
};

and maintain the entire list B as a sorted linked list.

The performance will be somewhat worse ^(a) than using the 100-bit number as an array index into a truly immense array (one with 2^100 entries, which is impossible given the size of the known universe).

When it comes time to insert a new item into B, insert it at its desired position (before one that's equal or greater). If it was a brand new one (you'll know this if the one you're inserting before is different), also add it to A.
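A rough C++ sketch of that insertion step, using std::list in place of the hand-rolled node structure. The names Key and insert_sorted are hypothetical, and the 16-byte key is assumed to be compared byte by byte:

```cpp
#include <algorithm>
#include <array>
#include <list>

// Assumption: a matrix is packed into 16 bytes (100 bits padded to 128).
using Key = std::array<unsigned char, 16>;

// Inserts `m` into the sorted list `b`. If no equal key was already in `b`,
// `m` is also appended to `a`. Returns true when `m` was new.
bool insert_sorted(std::list<Key>& b, std::list<Key>& a, const Key& m)
{
    // First element that is equal to or greater than m.
    // On a list this still walks the nodes one by one.
    auto pos = std::lower_bound(b.begin(), b.end(), m);
    const bool is_new = (pos == b.end() || *pos != m);
    b.insert(pos, m);            // insert before that element
    if (is_new) a.push_back(m);  // brand new matrix, so record it in A
    return is_new;
}
```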

^(a) Though probably not unmanageably so - there are options you can take to improve the speed.

One possibility is to use skip lists for faster traversal during searches. Here each node carries an extra pointer that references not the next element but one that is 10 (or 100 or 1000) elements along. That way you can get close to the desired element reasonably quickly and just do the one-step search after that point.

Alternatively, since you're talking about bits, you can divide B into (for example) 1024 sub-B lists. Use the first 10 bits of the 100-bit value to figure out which sub-B you need to use and only store the next 90 bits. That alone would speed up searches by a factor of about 1000 on average, assuming the leading bits are roughly evenly distributed (use more leading bits and more sub-Bs if you need improvement on that).
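A minimal sketch of the sub-B idea, with several assumptions: matrices are held as std::bitset<100>, "the first 10 bits" is taken to mean bit positions 0 to 9, and each sub-B is a std::set of 90-character strings rather than a linked list. The class name BucketedSet is made up for illustration:

```cpp
#include <bitset>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Matrix = std::bitset<100>;

// Hypothetical container: 1024 sub-sets selected by 10 bits of the matrix.
class BucketedSet {
public:
    BucketedSet() : buckets_(1024) {}

    // Returns true if `m` had not been seen before.
    bool insert(const Matrix& m)
    {
        // Build the bucket index from bit positions 0..9.
        std::size_t bucket = 0;
        for (std::size_t i = 0; i < 10; ++i)
            bucket = (bucket << 1) | (m[i] ? 1u : 0u);

        // Keep only the other 90 bits: shift them down, then drop the
        // 10 leading zero characters that the shift produced.
        const std::string rest = (m >> 10).to_string().substr(10);
        return buckets_[bucket].insert(rest).second;
    }

private:
    std::vector<std::set<std::string>> buckets_;
};
```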

You could also use a hash on the 100-bit value to generate a smaller key which you can use as an index into an array/list, but I don't think that will give you any real advantage over the method in the previous paragraph.
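For comparison, here is a small sketch of the hash variant, assuming the matrices are stored as std::bitset<100> (the standard library provides std::hash for bitsets). The function name unique_by_hash is hypothetical:

```cpp
#include <bitset>
#include <unordered_set>
#include <vector>

using Matrix = std::bitset<100>;

// Hypothetical helper: returns the distinct matrices of `b`
// in order of first appearance.
std::vector<Matrix> unique_by_hash(const std::vector<Matrix>& b)
{
    std::unordered_set<Matrix> seen;  // hashes the 100-bit value
    std::vector<Matrix> a;
    for (const Matrix& m : b)
        if (seen.insert(m).second)    // true only on first sighting
            a.push_back(m);
    return a;
}
```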

Upvotes: 1

ObscureRobot

Reputation: 7336

It seems like you might really be looking for a way to manage a large, sparse array. Trivially, you could use a hash map with your giant index as the key and your data as the value. If you tell us more about your problem, we might be able to find a more appropriate data structure.

Update:

If set B is just some set of matrices and not the set of all possible 10x10 binary matrices, then you just want a sparse array. Every time you find a new matrix, you compute its key (which could simply be the matrix converted into a 100-digit binary value, or even a 100-character string!) and look up that key. If no such key exists, insert the value 1 for that key. If the key does exist, increment its value and store it back.
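As a sketch of that counting scheme, assuming the key is the 100-character string form and using std::unordered_map as the sparse array (the function name count_matrices is made up):

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: maps each matrix key to the number of times it
// occurs in `b`. Assumption: keys are 100-character strings of '0'/'1'.
std::unordered_map<std::string, int>
count_matrices(const std::vector<std::string>& b)
{
    std::unordered_map<std::string, int> counts;
    for (const std::string& key : b)
        ++counts[key];  // a missing key starts at 0, so it becomes 1
    return counts;
}
```

The keys of the resulting map are then the unique matrices, and the values tell you how often each one appeared.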

Upvotes: 4
