Reputation: 53
I'm trying to figure out a way to count the number of times that a subset appears in a list of lists. For example, if I have the following list:
dataset = [[0,0,1,0,1,0],[0,0,1,0,1,1],[1,0,1,0,1,0],[0,1,1,0,0,0]]
The pattern [0,0,1,0,1,0]
appears in three of the four items of the list (i.e. in three of the lists, the elements at index 2 and index 4 are set to 1, just like in the pattern). How can I count the number of times that the pattern appears?
So far I've tried this, but it does not work:
subsets_count = []
for i in range(len(dataset)):
    current_subset_count = 0
    for j in range(len(dataset)):
        if dataset[i] in dataset[j]:
            subset_count += 1
    subsets_count.append(current_subset_count)
Upvotes: 0
Views: 244
Reputation: 19223
For each sublist, generate a set of indices where the ones exist. Do the same for the pattern. Then, for each set of indices, find whether the pattern indices are a subset of that set. If so, the pattern is in the sublist.
one_indices_of_subsets = [{i for i, v in enumerate(sublist) if v} for sublist in dataset]
pattern_indices = {i for i, v in enumerate(pattern) if v}
result = sum(1 for s in one_indices_of_subsets if pattern_indices <= s)
print(result)
This outputs:
3
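The same idea can also be wrapped in a function so it runs on its own. This is only a minimal sketch: the name `count_pattern` is an illustrative choice, not something from the question, and the data is copied from the question.

```python
def count_pattern(dataset, pattern):
    """Count the sublists that have a 1 wherever the pattern has a 1."""
    # Indices of the ones in the pattern.
    pattern_indices = {i for i, v in enumerate(pattern) if v}
    count = 0
    for sublist in dataset:
        # Indices of the ones in this sublist.
        one_indices = {i for i, v in enumerate(sublist) if v}
        # For sets, <= tests "is a subset of".
        if pattern_indices <= one_indices:
            count += 1
    return count


dataset = [[0, 0, 1, 0, 1, 0], [0, 0, 1, 0, 1, 1], [1, 0, 1, 0, 1, 0], [0, 1, 1, 0, 0, 0]]
pattern = [0, 0, 1, 0, 1, 0]
print(count_pattern(dataset, pattern))  # 3
```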
Upvotes: 2
Reputation: 27588
Using one of my favorite itertools, compress:
from itertools import compress
[sum(all(compress(e, d)) for e in dataset)
 for d in dataset]
Results in:
[3, 1, 1, 1]
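If only one pattern matters rather than every list in turn, the same `compress` trick can be applied once. A small sketch, with `pattern` and `count` as names I chose for illustration and the data taken from the question:

```python
from itertools import compress

dataset = [[0, 0, 1, 0, 1, 0], [0, 0, 1, 0, 1, 1], [1, 0, 1, 0, 1, 0], [0, 1, 1, 0, 0, 0]]
pattern = [0, 0, 1, 0, 1, 0]

# compress(row, pattern) keeps the items of row at the positions where
# pattern is truthy; all(...) then checks that each kept item is a 1.
count = sum(all(compress(row, pattern)) for row in dataset)
print(count)  # 3
```

Note that an all-zero pattern selects nothing, so `all` returns True and every list is counted.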
Upvotes: 2
Reputation: 11
This allows for one digit to be different from the pattern. Straightforward pattern matcher:
dataset = [[0,0,1,0,1,0],[0,0,1,0,1,1],[1,0,1,0,1,0],[0,1,1,0,0,0]]
pattern = [0,0,1,0,1,0]
m = len(pattern)
subsets_count = 0
for i in range(len(dataset)):
    count = 0
    for j in range(m):
        if dataset[i][j] == pattern[j]:
            count += 1
    if count >= m - 1:
        subsets_count += 1
print(subsets_count)
Output: 3
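Note that counting differing positions is not the same test as "has a 1 wherever the pattern has a 1"; the two happen to agree on this dataset. For instance, [0,0,1,0,0,0] differs from the pattern in only one position but is missing one of its ones. A sketch with the tolerance as a parameter (the function name and `max_mismatches` are hypothetical names chosen for illustration):

```python
def count_near_matches(dataset, pattern, max_mismatches=1):
    """Count rows that differ from the pattern in at most max_mismatches positions."""
    total = 0
    for row in dataset:
        # Number of positions where the row and the pattern disagree.
        mismatches = sum(a != b for a, b in zip(row, pattern))
        if mismatches <= max_mismatches:
            total += 1
    return total


dataset = [[0, 0, 1, 0, 1, 0], [0, 0, 1, 0, 1, 1], [1, 0, 1, 0, 1, 0], [0, 1, 1, 0, 0, 0]]
pattern = [0, 0, 1, 0, 1, 0]
print(count_near_matches(dataset, pattern))     # 3
print(count_near_matches(dataset, pattern, 0))  # 1 (exact matches only)
```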
Upvotes: 1
Reputation: 195418
Try:
dataset = [
    [0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 1],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 0],
]
pat = [0, 0, 1, 0, 1, 0]
cnt = sum(all(a == b for a, b in zip(pat, d) if a == 1) for d in dataset)
print(cnt)
Prints:
3
Upvotes: 0
Reputation: 919
If you want to count exact occurrences of the pattern (taking the order of its elements into account), you can simply use the list's .count() method, as follows:
dataset = [[0,0,1,0,1,0],[0,0,1,0,1,1],[1,0,1,0,1,0],[0,1,1,0,0,0],[0,0,1,0,1,0]]
num_count = dataset.count([0,0,1,0,1,0])
print(num_count)
Output:
2
If you don't care about the order of the zeros and ones, you can compare sums instead:
dataset = [[0,0,1,0,1,0],[0,0,1,0,1,1],[1,0,1,0,1,0],[0,1,1,0,0,0],[0,0,1,0,1,0]]
num_count = [sum(el) for el in dataset].count(sum([0,0,1,0,1,0]))
print(num_count)
Output:
3
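Keep in mind that comparing sums only checks how many ones a list contains, not where they are. A small sketch of that behaviour (the rows below are made up for illustration, and `target` and `matches` are my own names):

```python
pattern = [0, 0, 1, 0, 1, 0]
rows = [[0, 0, 1, 0, 1, 0], [1, 1, 0, 0, 0, 0], [0, 0, 1, 0, 1, 1]]

target = sum(pattern)  # number of ones in the pattern
# Keep every row with the same number of ones, wherever they sit.
matches = [row for row in rows if sum(row) == target]
print(matches)  # [[0, 0, 1, 0, 1, 0], [1, 1, 0, 0, 0, 0]]
```

The second row is counted even though its ones are at different indices, and the third is not counted even though it has ones at both pattern positions.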
Upvotes: 0