DrTchocky
DrTchocky

Reputation: 535

Retrieve number of times lists overlap by index

I have a number of lists that have values of either 1 or 0 and have length 1024. I would like to find the number of times that two given lists overlap over all indexes (but only if they match with values == 1), but can't seem to think of way that keeps the number of comparisons low. Currently my approach is to get all indexes of my lists where the value == 1, and get the intersection of two lists, e.g.

#for each list, do the following
for x,j in enumerate(list1): 
    if j == 1:
       idx_list.append(x) 
# compare  two lists 
num_overlap = set(idx_list1).intersection(idx_list2)

Is this the most efficient way to find this value?

For example input/output (only showing 6 values instead of 1024):

list1 = [1 0 1 0 0 0]
list2 = [1 0 0 0 0 0]
num_overlap = 1 (both lists have ```1``` at index 0) 

Upvotes: 3

Views: 164

Answers (2)

Jean-François Fabre
Jean-François Fabre

Reputation: 140286

Just zip lists together, and apply all on the zipped result to see if it's all non-zeroes (if both elements of the list are "truthy". Issue 1 if it's the case. Sum the generator comprehension.

list1 = [1,0,1,0,0,0]
list2 = [1,0,0,0,0,0]

num_overlap = sum(1 for t in zip(list1,list2) if all(t))

Note: works with any number of lists that you can feed to zip.

Variant: since all(t) evaluates to 1, the code can be shortened a little bit, and we can even use map here to avoid the loop:

num_overlap = sum(map(all,zip(list1,list2)))

Benchmarking both solutions on a great deal of iterations:

2.3228490352630615 (gencomp)
2.1401889324188232 (map)

And the suggested solution using sum(x and y for x,y in zip(list1,list2)) is faster because there's no overhead from calling all

1.9283719062805176

(if you want to generalize with more than 2 lists you cannot use that last one, but if you only have 2 lists, it's the fastest option)

Upvotes: 6

Miriam Farber
Miriam Farber

Reputation: 19634

If you are just interested in the number of times, you can use numpy arrays, multiply them (the product is entrywise) and then sum the product. You will have 1 in the product only if the corresponding entries in both arrays were 1. Here is an example:

import numpy as np    
a=np.array([1,0,0,1])
b=np.array([1,0,1,0])
sum(a*b)

Then a * b=[1,0,0,0], and sum(a * b)=1.

Upvotes: 2

Related Questions