Bux
Bux

Reputation: 19

From a list of lists, how do I find lists with n elements in common

I have a list of lists

foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]

I am interested in finding all possible groups of lists which share n elements.

Expected results:

Lists sharing 3 elements

Lists sharing 2 elements

What I tried so far

What I want to try but seems really complicated and they must be a more convenient way

I posted a minimal example here, but in reality I have a list of hundred of list, each containing 6 elements, so I am afraid of combinatorial explosion as well.

If anyone could offers some guidance or tricks, that will be great.

Many thanks,

Best

Upvotes: 0

Views: 163

Answers (2)

Sushil
Sushil

Reputation: 5531

This should help you:

from itertools import combinations

def isSubset(comb,lst):
    it = iter(lst)
    return all(c in it for c in comb)

foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]
n = 3
print('-'*100)
print(f"n = {n}")
existing = []
for index in range(len(foo)):
    combs = combinations(foo[index],n)
    for comb in combs:
        occurrences = 0
        curr_lst = []
        for lst in foo:
            if isSubset(comb,lst):
                if comb not in existing:
                    occurrences += 1
                    curr_lst.append(lst)
                    if occurrences >= 2:
                        if occurrences == 2:
                            print('-' * 100)
                            print(f"Groups containing {comb}")
                            [print(elem) for elem in curr_lst]
                        else:
                            print(lst)
        existing.append(comb)

Output:

----------------------------------------------------------------------------------------------------
n = 3
----------------------------------------------------------------------------------------------------
Groups containing (1, 2, 4)
[1, 2, 4, 17]
[1, 2, 4, 12]
----------------------------------------------------------------------------------------------------
Groups containing (1, 2, 17)
[1, 2, 4, 17]
[1, 2, 5, 17]

Output for n=2:

----------------------------------------------------------------------------------------------------
n = 2
----------------------------------------------------------------------------------------------------
Groups containing (1, 2)
[1, 2, 4, 17]
[1, 2, 4, 12]
[1, 2, 5, 17]
----------------------------------------------------------------------------------------------------
Groups containing (1, 4)
[1, 2, 4, 17]
[1, 2, 4, 12]
----------------------------------------------------------------------------------------------------
Groups containing (1, 17)
[1, 2, 4, 17]
[1, 2, 5, 17]
----------------------------------------------------------------------------------------------------
Groups containing (2, 4)
[1, 2, 4, 17]
[1, 2, 4, 12]
----------------------------------------------------------------------------------------------------
Groups containing (2, 17)
[1, 2, 4, 17]
[1, 2, 5, 17]

Upvotes: 1

Wups
Wups

Reputation: 2569

I don't think there is an easy way around generating all combinations with n elements for each inner list. But every combination needs to be checked only once:

from itertools import combinations
import random

n = 3
foo = [[random.randint(0,100) for _ in range(6)] for _ in range(500)]
#foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]

checked = set() # already checked combinations
result = []

for lst in foo:
    cbs = combinations(lst, n)
    for comb in cbs:
        if not comb in checked:
            groups = [l for l in foo if all(i in l for i in comb)]
            if len(groups) > 1:
                result.append((comb, groups))
            checked.add(comb)

print(result)

Output:

[((1, 2, 4), [[1, 2, 4, 17], [1, 2, 4, 12]]),
 ((1, 2, 17), [[1, 2, 4, 17], [1, 2, 5, 17]])]

Performance:

For the randomly generated list with 500 sublists, values from 0-100 or 0-1000 inside the sublists and n=2 or n=3, the code took a few seconds to complete.

Upvotes: 1

Related Questions