Reputation: 9592
Is there a big difference in speed in these two code fragments?
1.
x = set( i for i in data )
versus:
2.
x = set( [ i for i in data ] )
I've seen people recommending set()
instead of set([])
; is this just a matter of style?
Upvotes: 1
Views: 366
Reputation: 309841
The form
x = set(i for i in data)
is shorthand for:
x = set((i for i in data))
This creates a generator expression which evaluates lazily. Compared to:
x = set([i for i in data])
which creates an entire list before passing it to set
From a performance standpoint, generator expressions allow for short-circuiting in certain functions (all
and any
come to mind) and takes less memory as you don't need to store the extra list -- In some cases this can be very significant.
If you actually are going to iterate over the entire iterable data
, and memory isn't a problem for you, I've found that typically the list-comprehension is slightly faster then the equivalent generator expression*.
temp $ python -m timeit 'set(i for i in "xyzzfoobarbaz")'
100000 loops, best of 3: 3.55 usec per loop
temp $ python -m timeit 'set([i for i in "xyzzfoobarbaz"])'
100000 loops, best of 3: 3.42 usec per loop
Note that if you're curious about speed -- Your fastest bet will probably be just:
x = set(data)
proof:
temp $ python -m timeit 'set("xyzzfoobarbaz")'
1000000 loops, best of 3: 1.83 usec per loop
*Cpython only -- I don't know how Jython or pypy optimize this stuff.
Upvotes: 6
Reputation: 170055
The []
syntax creates a list, which is discarded immediatley after the set is created. So you are increasing the memory footprint of the program.
The generator syntax avoids that.
Upvotes: 3