Stiggo
Stiggo

Reputation: 1047

Python timing - There must be a better way!

I hope someone can help me out with this. I'd like to measure sorting algorithms. Here's how I currently do it:

M = 1000 # number of executions
N = [1000, 2000, 4000, 16000] # size of the list
L = [100, 1000, 2000,16000] # max element of the list

# timing:
print 'Number of executions: %i' % (M)
print '-'*80
print '\tL\N\t|\t%i\t|\t%i\t|\t%i\t|\t%i' % (N[0], N[1], N[2], N[3])
print '-'*80
for l in L:
    print '\t%i\t' % l,
    for n in N: 
        t = 0
        for m in xrange(M):
            A = [random.randint(0,l-1) for r in xrange(n)] # generates an n long random list
            t0 = time.clock()
            pass # sort function call goes here
            t1 = time.clock()
            t += (t1-t0)
        print '|\t%0.3f\t' % ((t*1000.0)/M ), # avg time
    print
print '-'*80

This empty test takes about 4 minutes. I would appreciate any advice on how to make it faster.

Cheers

Edit: After Rafe Kettler's hint, I came up with this:

def sorting(LST):
    pass

if __name__ == "__main__" :
    M = 1000
    N = [1000, 2000, 4000, 16000]
    L = [100, 1000, 2000,16000]

    print 'Number of executions: %i' % (M)
    print '-'*80
    print '\tL\N\t|\t%i\t|\t%i\t|\t%i\t|\t%i' % (N[0], N[1], N[2], N[3])
    print '-'*80
    for l in L:
        print '\t%i\t' % l,
        for n in N:
            #------------------------
            t = timeit.Timer('sorting([random.randint(0,l-1) for r in xrange(n)])', 'from __main__ import sorting, n, l, random')
            #------------------------
            print '|\t%0.3f\t' % (t.timeit(M)/M ), # avg time
        print
    print '-'*80

Unfortunately it become slower. What am I doing wrong?

Upvotes: 3

Views: 621

Answers (5)

LmW.
LmW.

Reputation: 1364

Not quite answering the timimg question, but you can use the random module in numpy package to generate large array of random numbers very effeciently:

>>> from numpy import random
>>> l = 100; n = 16000
>>> random.randint(0,l-1,n)

Adapting OP's script, below is comparision of total time using numpy.random v.s. stock random module:

numpy.random
number of executions: 1000
--------------------------------------------------------------------------------
        L\N     |       1000    |       2000    |       4000    |       16000
--------------------------------------------------------------------------------
        100     |       0.022   |       0.043   |       0.084   |       0.332
        1000    |       0.016   |       0.031   |       0.059   |       0.231
        2000    |       0.016   |       0.030   |       0.059   |       0.231
        16000   |       0.016   |       0.030   |       0.059   |       0.231
--------------------------------------------------------------------------------

random 
Number of executions: 1000
--------------------------------------------------------------------------------
        L\N     |       1000    |       2000    |       4000    |       16000
--------------------------------------------------------------------------------
        100     |       2.152   |       4.271   |       8.649   |       34.007
        1000    |       2.264   |       4.501   |       8.762   |       34.956
        2000    |       2.202   |       4.412   |       8.743   |       34.818
        16000   |       2.205   |       4.398   |       8.735   |       34.823
--------------------------------------------------------------------------------

Upvotes: 0

Martlark
Martlark

Reputation: 14581

Generate the random numbers once. Put them in a shelve or pickle file and then read them out when you need to run a test.

Upvotes: 0

Mark Ransom
Mark Ransom

Reputation: 308500

Creating random numbers is a time-consuming task. You're creating 4*1000*(1000+2000+4000+16000) of them. The simplest possible test case takes over 7 minutes on my system:

>>> t=timeit.Timer('random.randint(0,15999)','import random')
>>> t.timeit(4*1000*(1000+2000+4000+16000))
447.08869618904077

As I said in a comment, it's extremely important to exclude the timings for creating your test data from the timings of the algorithm under test.

Upvotes: 1

seriyPS
seriyPS

Reputation: 7112

It is possible for you replace this code:

A = [random.randint(0,l-1) for r in xrange(n)]

With generator? eg

def A(n):
    for r in xrange(n):
        yield random.randint(0,l-1)

I think, most of time in your empty test is random list generation

Upvotes: 2

Rafe Kettler
Rafe Kettler

Reputation: 76965

timeit. Best way to time in Python, period. Refactor your algorithms into functions and use timeit to test the execution time.

Upvotes: 12

Related Questions