Reputation: 1085
>>> a = [1,1,1,2,3,4,4]
>>> b = [1,1,2,3,3,3,4]
[1,1,2,3,4]
Please note this is not the same question as "Python intersection of two lists keeping duplicates", because even though there are three 1s in list a, there are only two in list b, so the result should only have two.
Upvotes: 35
Views: 25216
Reputation: 101
The accepted solution posted using Counter is simple, but I think this approach using a dictionary will work too and can be faster -- even on lists that aren't ordered (that requirement wasn't really mentioned, but at least one of the other solutions assumes that is the case).
a = [1, 1, 1, 2, 3, 4, 4]
b = [1, 1, 2, 3, 3, 3, 4]
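from collections import Counter

def intersect(nums1, nums2):
    # Count how many times each value occurs in nums1.
    match = {}
    for x in nums1:
        if x in match:
            match[x] += 1
        else:
            match[x] = 1
    # Walk nums2 and use up one count for every match.
    i = []
    for x in nums2:
        if x in match:
            i.append(x)
            match[x] -= 1
            if match[x] == 0:
                del match[x]
    return i

def intersect2(nums1, nums2):
    # The accepted Counter-based solution, for comparison.
    return list((Counter(nums1) & Counter(nums2)).elements())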
timeit intersect(a,b)
100000 loops, best of 3: 3.8 µs per loop
timeit intersect2(a,b)
The slowest run took 4.90 times longer than the fastest. This could mean
that an intermediate result is being cached.
10000 loops, best of 3: 20.4 µs per loop
I tested with lists of random ints of size 1000 and 10000 and it was faster there too.
import random
a = [random.randint(0,100) for r in xrange(10000)]  # xrange is Python 2; use range on Python 3
b = [random.randint(0,100) for r in xrange(1000)]
timeit intersect(a,b)
100 loops, best of 3: 2.35 ms per loop
timeit intersect2(a,b)
100 loops, best of 3: 4.2 ms per loop
And on lists drawn from a smaller range of values, which therefore have more elements in common:
a = [random.randint(0,10) for r in xrange(10000)]
b = [random.randint(0,10) for r in xrange(1000)]
timeit intersect(a,b)
100 loops, best of 3: 2.07 ms per loop
timeit intersect2(a,b)
100 loops, best of 3: 3.41 ms per loop
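For anyone rerunning these timings outside IPython, here is a minimal sketch using the standard timeit module on Python 3. The two functions are the ones from this answer; the seed, list sizes, and number of repetitions are arbitrary assumptions, and absolute numbers will differ by machine and Python version.

```python
import random
import timeit
from collections import Counter


def intersect(nums1, nums2):
    # Count occurrences in nums1, then use up counts while scanning nums2.
    match = {}
    for x in nums1:
        if x in match:
            match[x] += 1
        else:
            match[x] = 1
    result = []
    for x in nums2:
        if x in match:
            result.append(x)
            match[x] -= 1
            if match[x] == 0:
                del match[x]
    return result


def intersect2(nums1, nums2):
    return list((Counter(nums1) & Counter(nums2)).elements())


random.seed(0)  # arbitrary seed, only so reruns use the same data
a = [random.randint(0, 100) for _ in range(10000)]
b = [random.randint(0, 100) for _ in range(1000)]

# Both approaches should agree once element order is ignored.
assert sorted(intersect(a, b)) == sorted(intersect2(a, b))

for fn in (intersect, intersect2):
    seconds = timeit.timeit(lambda: fn(a, b), number=100)
    print(f"{fn.__name__}: {seconds / 100 * 1e3:.3f} ms per call")
```

The sorted comparison is there because the dictionary version returns elements in the order they appear in the second list, while Counter.elements() groups equal values together.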
Upvotes: 6
Reputation: 16856
Simple with no additional imports and easy to debug :)
Disadvantage: The value of list b is changed. Work on a copy of b if you don't want to change b.
c = list()
for x in a:
    if x in b:
        b.remove(x)
        c.append(x)
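If b must stay intact, the same loop can run against a copy. This is only a sketch of that variation; the function name intersect_keep_b is made up for illustration.

```python
def intersect_keep_b(a, b):
    remaining = list(b)  # shallow copy, so the caller's b is not modified
    common = []
    for x in a:
        if x in remaining:
            remaining.remove(x)  # consume one matching occurrence
            common.append(x)
    return common


a = [1, 1, 1, 2, 3, 4, 4]
b = [1, 1, 2, 3, 3, 3, 4]
print(intersect_keep_b(a, b))  # [1, 1, 2, 3, 4]
print(b)                       # [1, 1, 2, 3, 3, 3, 4], unchanged
```

Both the membership test and remove() scan the list, so this gets slow on long inputs; for short lists that is unlikely to matter.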
Upvotes: 8
Reputation: 121
This should also work:
def list_intersect(lisA, lisB):
    """ Finds the intersection of 2 lists including common duplicates"""
    Iset = set(lisA).intersection(set(lisB))
    Ilis = []
    for i in Iset:
        num = min(lisA.count(i), lisB.count(i))
        for j in range(num):
            Ilis.append(i)
    return Ilis
Upvotes: 2
Reputation: 555
This should also work, provided both lists are already sorted in ascending order.
a = [1, 1, 1, 2, 3, 4, 4]
b = [1, 1, 2, 3, 3, 3, 4]
c = []
i, j = 0, 0
while i < len(a) and j < len(b):
    if a[i] == b[j]:
        c.append(a[i])
        i += 1
        j += 1
    elif a[i] > b[j]:
        j += 1
    else:
        i += 1
print(c)  # [1, 1, 2, 3, 4]
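The two-index walk relies on both inputs being in ascending order. For unsorted input, one option is to sort copies first, as in this sketch; the wrapper name intersect_sorted is hypothetical.

```python
def intersect_sorted(a, b):
    # Sort copies so the originals are untouched and the walk is valid.
    a, b = sorted(a), sorted(b)
    common = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif a[i] > b[j]:
            j += 1
        else:
            i += 1
    return common


# Same values as the example lists, shuffled.
print(intersect_sorted([4, 1, 3, 1, 2, 4, 1], [3, 1, 4, 3, 2, 1, 3]))  # [1, 1, 2, 3, 4]
```

Sorting requires the elements to be mutually comparable, which the dictionary and Counter approaches do not.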
Upvotes: 4
Reputation: 3069
This will do:
from itertools import chain
list(chain.from_iterable([(val,)*min(a.count(val), b.count(val)) for val in (set(a) & set(b))]))
Gives:
[1, 1, 2, 3, 4]
Upvotes: 2
Reputation: 29680
You can use collections.Counter for this, which will provide the lowest count found in either list for each element when you take the intersection.
from collections import Counter
c = list((Counter(a) & Counter(b)).elements())
Outputs:
[1, 1, 2, 3, 4]
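To see what the & operator is doing, it can help to look at the intermediate Counter before expanding it. A small illustration using the lists from the question:

```python
from collections import Counter

a = [1, 1, 1, 2, 3, 4, 4]
b = [1, 1, 2, 3, 3, 3, 4]

common = Counter(a) & Counter(b)  # keeps min(count in a, count in b) per element
print(common[1])  # 2, since a has three 1s but b has only two
print(common[3])  # 1, since a has one 3 and b has three

# elements() repeats each key by its count.
print(sorted(common.elements()))  # [1, 1, 2, 3, 4]
```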
Upvotes: 64