fransua

Reputation: 533

ipython / Question about the %timeit looping process

Here is my code in Google Colab:

myArray=[]

Then

%%timeit -n 2
myArray.append("1")

The result is:

[screenshot of the output]

I don't really understand it: I was expecting only two values in myArray.

Upvotes: 1

Views: 74

Answers (2)

Nicholas Obert

Reputation: 1628

Timeit has two arguments you can pass to tell it how many times the code should be run: number (-n) and repeat (-r).

  • repeat tells timeit how many samples it should take
  • number specifies the number of times to repeat the code for each sample

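Now, the total number of executions is number × repeat. The default repeat value of timeit.repeat is 5, so when number is 2 and repeat is 5, 2*5=10, which is the number of times the code is actually run and also the number of elements that get appended to the list. (The %timeit magic has its own default for -r, 7 in recent IPython versions, which would give 2*7=14.)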

To fix this you should also specify the repeat argument with -r.
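
As a rough sketch of the same arithmetic outside the notebook, the standard-library timeit.repeat can be used to count the appends directly. The helper name count_appends is made up for this illustration, and it relies on the stdlib function rather than the %%timeit magic itself.

```python
import timeit


def count_appends(number, repeat):
    """Run the append statement via timeit.repeat and return the final list length."""
    my_array = []
    timeit.repeat(
        stmt='my_array.append("1")',
        repeat=repeat,   # samples, equivalent to -r
        number=number,   # executions per sample, equivalent to -n
        globals={"my_array": my_array},
    )
    return len(my_array)


total_five_samples = count_appends(number=2, repeat=5)  # 2 * 5 executions
total_one_sample = count_appends(number=2, repeat=1)    # 2 * 1 executions
print(total_five_samples, total_one_sample)
```

With repeat=1 the list ends up with exactly number elements, which is what passing -r 1 alongside -n 2 achieves in the notebook.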

Edit

For every sample you take (-r), you also run the setup code you may have passed to timeit. On the other hand, the number (-n) tells timeit how many times it should run your code for every sample. Your code is executed n times only after the setup, which is done r times, once for every sample.

From the timeit documentation:

timeit.timeit(stmt='pass', setup='pass', timer=<default timer>, number=1000000, globals=None)

This is the timeit function signature. As you can see, there is a setup parameter you can pass to specify any code that should be run prior to executing your code (which will be executed number times, equivalent to -n) in the sample.

Let's now compare it with the timeit.repeat function:

timeit.repeat(stmt='pass', setup='pass', timer=<default timer>, repeat=5, number=1000000, globals=None)

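As you can see, there is an extra parameter here: repeat (equivalent to -r). Its default value in timeit.repeat is 5, which with number=2 gives 10 appended elements. The %timeit magic picks its own default for -r (7 in recent IPython versions), so the exact count you see depends on that default.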

Why should you use both -r and -n?

It's better to specify both the number of runs per sample and the number of samples to take, so that results are comparable. Think of every sample as one execution of your Python script: Python has to load the script, perform some initial setup, and only then does it run your code.
You can also think of the number (-n) as the number of iterations in a for loop: the setup has already been done before your code runs.

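Here's a simplified Python-like pseudocode representation of what's happening when you use the timeit module (perform_setup and run are placeholders, not real functions):

def timeit(your_code, repeat, number, setup):
    for r in range(repeat):        # one pass per sample (-r)
        perform_setup()            # setup runs once per sample
        for n in range(number):    # your code runs `number` times per sample (-n)
            run(your_code)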
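
A runnable way to check that loop structure, as a minimal sketch: timeit accepts zero-argument callables for stmt and setup, so counting the calls shows how often each one runs. The names calls, setup_step and stmt_step are hypothetical and exist only for this illustration.

```python
import timeit

# Call counters for the two phases.
calls = {"setup": 0, "stmt": 0}


def setup_step():
    calls["setup"] += 1  # expected once per sample


def stmt_step():
    calls["stmt"] += 1   # expected `number` times per sample


timeit.repeat(stmt=stmt_step, setup=setup_step, repeat=3, number=4)
print(calls)
```

Here setup should be counted 3 times (once per sample) and the statement 3*4=12 times.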

Hope this helps, cheers.

Upvotes: 2

hpaulj

Reputation: 231540

timeit has loops and runs. With -n you specified only the loops per run; the number of runs (-r) stayed at its default.

In [80]: alist = []
In [81]: %%timeit
    ...: alist.append(1)
    ...: 
    ...: 
146 ns ± 12.6 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [82]: len(alist)
Out[82]: 81111111

In [83]: alist=[]    
In [84]: %%timeit -n2 -r1
    ...: alist.append(1)
    ...: 
    ...: 
1.75 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 2 loops each)

In [85]: len(alist)
Out[85]: 2

In [86]: alist=[]    
In [87]: %%timeit -n2
    ...: alist.append(1)
    ...: 
    ...: 
993 ns ± 129 ns per loop (mean ± std. dev. of 7 runs, 2 loops each)

In [88]: len(alist)
Out[88]: 14
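
The 81111111 in Out[82] can be accounted for with a little arithmetic. This sketch assumes that, when -n is not given, IPython first tries 1, 10, 100, ... loops until one trial takes long enough, and that those trial loops also append to the list; that is an assumption about IPython's internals, not something shown in the session above.

```python
# Trial loops while the loop count is being chosen (assumed behaviour):
# 1 + 10 + 100 + ... + 10,000,000
trial_loops = sum(10 ** exponent for exponent in range(8))

# Timed loops as reported in the output: 7 runs of 10,000,000 loops each.
timed_loops = 7 * 10_000_000

total_appends = trial_loops + timed_loops
print(trial_loops, timed_loops, total_appends)
```

Under that assumption the total matches the reported length, and it also explains why passing -n explicitly (as in In [87]) gives exactly loops × runs = 2*7 = 14.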

Generally I try to set up a timeit so I don't care what the "result" is, since I want the times, not some sort of accumulated list or array.

For a fresh list each run:

In [89]: %%timeit -n2  -r10 alist=[]
    ...: alist.append(1)
    ...: 
    ...: 
The slowest run took 4.75 times longer than the fastest. This could mean that an intermediate result is being cached.
1.21 µs ± 889 ns per loop (mean ± std. dev. of 10 runs, 2 loops each)
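
The same "fresh list each run" idea can be sketched with the standard-library timeit, where the setup string plays the role of the code on the %%timeit line. The list created_lists is a hypothetical helper that keeps a reference to every list the setup creates, so their lengths can be inspected afterwards.

```python
import timeit

# Holds a reference to each list created by the setup code.
created_lists = []

timeit.repeat(
    stmt="alist.append(1)",
    # Setup runs once per run: make a new list and remember it.
    setup="alist = []; created_lists.append(alist)",
    repeat=10,  # like -r10
    number=2,   # like -n2
    globals={"created_lists": created_lists},
)

lengths = [len(a) for a in created_lists]
print(lengths)
```

Each of the 10 runs gets its own list, and each list holds only the 2 elements appended during that run.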

Upvotes: 1
