Reputation: 19
I have a list of lists
foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]
I am interested in finding all possible groups of lists which share n elements.
Expected results:
Lists sharing 3 elements
[[1, 2, 4, 17], [1, 2, 4, 12]]
[[1, 2, 4, 17], [1, 2, 5, 17]]
Lists sharing 2 elements
[[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]
[[1, 2, 4, 17], [1, 2, 4, 12]]
[[1, 2, 4, 17], [1, 2, 4, 12]]
[[1, 2, 4, 17], [1, 2, 5, 17]]
[[1, 2, 4, 17], [1, 2, 5, 17]]
What I tried so far
What I want to try but seems really complicated and they must be a more convenient way
I posted a minimal example here, but in reality I have a list of hundred of list, each containing 6 elements, so I am afraid of combinatorial explosion as well.
If anyone could offers some guidance or tricks, that will be great.
Many thanks,
Best
Upvotes: 0
Views: 163
Reputation: 5531
This should help you:
from itertools import combinations
def isSubset(comb,lst):
it = iter(lst)
return all(c in it for c in comb)
foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]
n = 3
print('-'*100)
print(f"n = {n}")
existing = []
for index in range(len(foo)):
combs = combinations(foo[index],n)
for comb in combs:
occurrences = 0
curr_lst = []
for lst in foo:
if isSubset(comb,lst):
if comb not in existing:
occurrences += 1
curr_lst.append(lst)
if occurrences >= 2:
if occurrences == 2:
print('-' * 100)
print(f"Groups containing {comb}")
[print(elem) for elem in curr_lst]
else:
print(lst)
existing.append(comb)
Output:
----------------------------------------------------------------------------------------------------
n = 3
----------------------------------------------------------------------------------------------------
Groups containing (1, 2, 4)
[1, 2, 4, 17]
[1, 2, 4, 12]
----------------------------------------------------------------------------------------------------
Groups containing (1, 2, 17)
[1, 2, 4, 17]
[1, 2, 5, 17]
Output for n=2
:
----------------------------------------------------------------------------------------------------
n = 2
----------------------------------------------------------------------------------------------------
Groups containing (1, 2)
[1, 2, 4, 17]
[1, 2, 4, 12]
[1, 2, 5, 17]
----------------------------------------------------------------------------------------------------
Groups containing (1, 4)
[1, 2, 4, 17]
[1, 2, 4, 12]
----------------------------------------------------------------------------------------------------
Groups containing (1, 17)
[1, 2, 4, 17]
[1, 2, 5, 17]
----------------------------------------------------------------------------------------------------
Groups containing (2, 4)
[1, 2, 4, 17]
[1, 2, 4, 12]
----------------------------------------------------------------------------------------------------
Groups containing (2, 17)
[1, 2, 4, 17]
[1, 2, 5, 17]
Upvotes: 1
Reputation: 2569
I don't think there is an easy way around generating all combinations with n elements for each inner list. But every combination needs to be checked only once:
from itertools import combinations
import random
n = 3
foo = [[random.randint(0,100) for _ in range(6)] for _ in range(500)]
#foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]
checked = set() # already checked combinations
result = []
for lst in foo:
cbs = combinations(lst, n)
for comb in cbs:
if not comb in checked:
groups = [l for l in foo if all(i in l for i in comb)]
if len(groups) > 1:
result.append((comb, groups))
checked.add(comb)
print(result)
Output:
[((1, 2, 4), [[1, 2, 4, 17], [1, 2, 4, 12]]),
((1, 2, 17), [[1, 2, 4, 17], [1, 2, 5, 17]])]
For the randomly generated list with 500 sublists, values from 0-100 or 0-1000 inside the sublists and n=2
or n=3
, the code took a few seconds to complete.
Upvotes: 1