Efficient search for collisions in multiple lists

Question

I have multiple lists with data of the following form (this is a simplified example; in practice the row vectors have a much larger dimension):

list 1: [num1] [[1,0,0,1,0], [0,0,1,0,1], [0,1,0,1,0], ...]
list 2: [num2] [[0,0,0,1,0], [1,0,0,1,0], [0,0,1,0,0], ...]

...
list n: [numn] [[1,1,0,1,0], [1,0,0,1,1], [0,0,1,0,1], ...]

Every list is marked with its own number [num] (the numbers are not repeated).

The main question is: how can I efficiently find every row vector that occurs in more than one list, together with the nums of the lists that contain it?

In detail:

For example, the row vector [1,0,0,1,0] occurs in list 1 and list 2, so I should return [1,0,0,1,0] : [num1], [num2]

Hash tables are the first thing that comes to mind. I think they are the best fit given the large amount of data, but I only know hash tables superficially and can't work out a clear algorithm for this case. Can anyone advise what I should pay attention to and which modules I should consider? Are there perhaps other efficient approaches?


Answers (1)
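
A minimal sketch of the hash-table idea from the question, assuming Python and assuming the data is available as a dict that maps each list's number to its rows. The name `find_shared_vectors` and the ids 1, 2, 3 (standing in for num1, num2, numn) are made up for illustration. Each row is converted to a tuple so it can be used as a dictionary key, and the dictionary collects the set of list numbers in which that row appears; rows seen in only one list are dropped at the end.

```python
from collections import defaultdict


def find_shared_vectors(lists_by_num):
    """Return {row_vector: sorted list numbers} for rows found in 2+ lists."""
    owners = defaultdict(set)  # row vector (as tuple) -> numbers of lists containing it
    for num, rows in lists_by_num.items():
        for row in rows:
            # Lists are not hashable, so use an immutable tuple as the key.
            owners[tuple(row)].add(num)
    # Keep only the vectors that occur in more than one list.
    return {vec: sorted(nums) for vec, nums in owners.items() if len(nums) > 1}


# Hypothetical sample data modelled on the example in the question.
lists_by_num = {
    1: [[1, 0, 0, 1, 0], [0, 0, 1, 0, 1], [0, 1, 0, 1, 0]],
    2: [[0, 0, 0, 1, 0], [1, 0, 0, 1, 0], [0, 0, 1, 0, 0]],
    3: [[1, 1, 0, 1, 0], [1, 0, 0, 1, 1], [0, 0, 1, 0, 1]],
}

shared = find_shared_vectors(lists_by_num)
for vec, nums in shared.items():
    print(list(vec), ":", nums)
```

This makes a single pass over all rows, so the expected running time is roughly proportional to the total number of vector elements. Because a set is used per vector, a row repeated inside the same list is not reported as a collision. For long 0/1 vectors, a more compact key such as `bytes(row)` may reduce memory use, though whether that matters depends on the data.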

