Reputation: 8703
Ok, so I have tried reading a couple of sources: One, Two, Three. The first source clearly points that using generator expressions conserves memory, and makes things faster. But what happens when they get nested? Here is an example, and I'm hoping that someone will be able to help me understand it.
In [1]: def nested():
[(ix, '-'.join(str(x) for x in hours)) for ix in table.index.get_level_values(0)]
%timeit nested
Out [1]: 10000000 loops, best of 3: 33.5 ns per loop
In [2]: def not_nested():
hr = '-'.join(str(x) for x in hours) #hr = 7-8
[(ix, hr) for ix in table.index.get_level_values(0)]
%timeit not_nested
Out [2]: 10000000 loops, best of 3: 38.7 ns per loop
In the above example, hours
is a list 2 elements long, while the number of indexes in table
at level 0 is 32.
If I were to run the two functions within my head, I would assume that in function nested
the second half of the tuple ('-'.join(str(x) for x in hours
) will be called as many times as the 'outer' (ix
) loop runs (that is as many times as there are indexes in table
). However, in function not_nested
the second half of the tuple gets initialized once (stored in hr
), and will not be run each time the second line executes.
First, am I correct in thinking that that is how Python works? If I am then can anyone explain how is that the run time of the nested function shorter than the one not nested?
Begin Edit/Solution
Apparently, I made a mistake in calling the function, as the two answers enlighten. I re-ran timeit
with the correct function call, and got expected results:
In [1]: %timeit nested()
Out [1]: 10000 loops, best of 3: 55.5 µs per loop
In [2]: %timeit not_nested()
Out [2]: 100000 loops, best of 3: 4.53 µs per loop
Ultimately, making the runtime when not nested a fraction of the runtime when nested. Thanks to all for answering and clearing this out!
End Edit
Upvotes: 0
Views: 97
Reputation: 31524
You are timing how long it takes to retrieve the function from the namespace. The namespace is a dictionary and in order to search something in a dictionary, python has to invoke the hash
function on the key (the function name).
Longer keys take more time to produce the hash, so the delta you are seeing is because nested
is 6 letters, while not_nested
is 10 letters. Try to switch the names and magically the slower function will become faster according to your test. In order to time a function you have to invoke it:
%timeit nested()
%timeit not_nested()
Upvotes: 1
Reputation: 279335
You aren't measuring the time to call the function, you're measuring the time to look up the name of the function in the current scope.
Try %timeit nested()
.
As for why there's a difference in time -- it's tempting to say "because the second name is longer, so the string comparison in the hash lookup takes longer", but I don't think that's actually correct because (I think) the strings involved will have been interned by CPython anyway. Certainly, I can't get Python to consistently report longer times for looking up longer function names in my own tests.
If you're interested in the lookup time, start by running it more times and see how consistent the numbers are.
Upvotes: 2