Reputation: 1009
How can I intersect a list and a set to preserve the order of the list? Simple example:
k=[1,2,3,4]
d={3,2}
d.intersection(k)
[2,3]#this is the ideal result
edit: speed is the most important factor here
Upvotes: 1
Views: 1138
Reputation: 1124170
You'll have to use a list comprehension, which lets you keep the order by filtering:
[i for i in k if i in d]
Demo:
>>> k = [1, 2, 3, 4]
>>> d = {2, 3}
>>> [i for i in k if i in d]
[2, 3]
For Python 3, this is the fastest option available; you could use list(filter(d.__contains__, l))
but this is slower on this trivially small example:
>>> from timeit import timeit
>>> def listcomp(k, d):
... return [i for i in k if i in d]
...
>>> def filtered(k, d):
... return list(filter(d.__contains__, k))
...
>>> timeit('listcomp(k, d)', 'from __main__ import listcomp, k, d')
0.49590064199946937
>>> timeit('filtered(k, d)', 'from __main__ import filtered, k, d')
0.6533352420010488
and timings worsen when the data sets grow in size:
>>> import random
>>> k = sorted([random.randrange(1000) for _ in range(1000)])
>>> d = {random.randrange(1000) for _ in range(100)}
>>> timeit('listcomp(k, d)', 'from __main__ import listcomp, k, d', number=10000)
0.30027976899873465
>>> timeit('filtered(k, d)', 'from __main__ import filtered, k, d', number=10000)
0.4524774450001132
In Python 2, filter()
is the faster option, given large enough inputs:
>>> from timeit import timeit
>>> import random
>>> def listcomp(k, d):
... return [i for i in k if i in d]
...
>>> def filtered(k, d):
... return filter(d.__contains__, k)
...
>>> k = [1, 2, 3, 4]
>>> d = {2, 3}
>>> timeit('listcomp(k, d)', 'from __main__ import listcomp, k, d')
0.4015800952911377
>>> timeit('filtered(k, d)', 'from __main__ import filtered, k, d')
0.4407978057861328
>>> k = sorted([random.randrange(1000) for _ in range(1000)])
>>> d = {random.randrange(1000) for _ in range(100)}
>>> timeit('listcomp(k, d)', 'from __main__ import listcomp, k, d', number=10000)
0.4594550132751465
>>> timeit('filtered(k, d)', 'from __main__ import filtered, k, d', number=10000)
0.28088998794555664
I'm not aware of any faster options; the available ordered set implementations are all Pure-python solutions, and slower still.
Upvotes: 7