Reputation: 1047
I hope someone can help me out with this. I'd like to measure sorting algorithms. Here's how I currently do it:
import random
import time

M = 1000 # number of executions
N = [1000, 2000, 4000, 16000] # sizes of the list
L = [100, 1000, 2000, 16000] # max element of the list

# timing:
print 'Number of executions: %i' % (M)
print '-'*80
print '\tL\N\t|\t%i\t|\t%i\t|\t%i\t|\t%i' % (N[0], N[1], N[2], N[3])
print '-'*80
for l in L:
    print '\t%i\t' % l,
    for n in N:
        t = 0
        for m in xrange(M):
            A = [random.randint(0, l-1) for r in xrange(n)] # generates a random list of length n
            t0 = time.clock()
            pass # sort function call goes here
            t1 = time.clock()
            t += (t1 - t0)
        print '|\t%0.3f\t' % ((t*1000.0)/M), # avg time in ms
    print
print '-'*80
This empty test takes about 4 minutes. I would appreciate any advice on how to make it faster.
Cheers
Edit: After Rafe Kettler's hint, I came up with this:
import random
import timeit

def sorting(LST):
    pass

if __name__ == "__main__":
    M = 1000
    N = [1000, 2000, 4000, 16000]
    L = [100, 1000, 2000, 16000]
    print 'Number of executions: %i' % (M)
    print '-'*80
    print '\tL\N\t|\t%i\t|\t%i\t|\t%i\t|\t%i' % (N[0], N[1], N[2], N[3])
    print '-'*80
    for l in L:
        print '\t%i\t' % l,
        for n in N:
            #------------------------
            t = timeit.Timer('sorting([random.randint(0,l-1) for r in xrange(n)])',
                             'from __main__ import sorting, n, l, random')
            #------------------------
            print '|\t%0.3f\t' % (t.timeit(M)/M), # avg time
        print
    print '-'*80
Unfortunately it became slower. What am I doing wrong?
Upvotes: 3
Views: 621
Reputation: 1364
Not quite answering the timing question, but you can use the random module in the numpy package to generate large arrays of random numbers very efficiently:
>>> from numpy import random
>>> l = 100; n = 16000
>>> random.randint(0, l, n) # numpy's upper bound is exclusive, so l here matches the stdlib's randint(0, l-1)
Adapting the OP's script, below is a comparison of the total time using numpy.random vs. the stock random module:
numpy.random
number of executions: 1000
--------------------------------------------------------------------------------
L\N | 1000 | 2000 | 4000 | 16000
--------------------------------------------------------------------------------
100 | 0.022 | 0.043 | 0.084 | 0.332
1000 | 0.016 | 0.031 | 0.059 | 0.231
2000 | 0.016 | 0.030 | 0.059 | 0.231
16000 | 0.016 | 0.030 | 0.059 | 0.231
--------------------------------------------------------------------------------
random
Number of executions: 1000
--------------------------------------------------------------------------------
L\N | 1000 | 2000 | 4000 | 16000
--------------------------------------------------------------------------------
100 | 2.152 | 4.271 | 8.649 | 34.007
1000 | 2.264 | 4.501 | 8.762 | 34.956
2000 | 2.202 | 4.412 | 8.743 | 34.818
16000 | 2.205 | 4.398 | 8.735 | 34.823
--------------------------------------------------------------------------------
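If the sort under test expects a plain Python list rather than an array, the NumPy output can be converted once per run. A minimal sketch (note that NumPy's randint excludes the upper bound, so passing l rather than l-1 matches the stdlib call):

```python
import numpy as np

l, n = 100, 16000
# np.random.randint's upper bound is exclusive, so pass l to match
# the stdlib's random.randint(0, l-1)
A = np.random.randint(0, l, n).tolist()
```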
Upvotes: 0
Reputation: 14581
Generate the random numbers once. Put them in a shelve or pickle file and then read them out when you need to run a test.
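A minimal sketch of that idea (Python 3 syntax; the file name testdata.pkl and the parameter values are just examples):

```python
import pickle
import random

# Generate each (max-element, length) test list once...
data = {(l, n): [random.randint(0, l - 1) for _ in range(n)]
        for l in (100, 1000) for n in (1000, 2000)}
with open('testdata.pkl', 'wb') as f:
    pickle.dump(data, f)

# ...and load it back for every timing run instead of regenerating it.
with open('testdata.pkl', 'rb') as f:
    data = pickle.load(f)
```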
Upvotes: 0
Reputation: 308500
Creating random numbers is a time-consuming task. You're creating 4*1000*(1000+2000+4000+16000) of them. The simplest possible test case takes over 7 minutes on my system:
>>> t=timeit.Timer('random.randint(0,15999)','import random')
>>> t.timeit(4*1000*(1000+2000+4000+16000))
447.08869618904077
As I said in a comment, it's extremely important to exclude the timings for creating your test data from the timings of the algorithm under test.
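One way to do that with timeit is to build the list in the setup statement, so only the sort itself is measured (a sketch using sorted() as a stand-in for the algorithm under test):

```python
import timeit

n, l = 1000, 100

# The setup runs once per .timeit() call; only 'sorted(A)' is timed.
t = timeit.Timer('sorted(A)',
                 'import random\n'
                 'A = [random.randint(0, %d) for _ in range(%d)]' % (l - 1, n))
avg = t.timeit(1000) / 1000
print('avg sort time: %f s' % avg)
```

One caveat: the same list is reused for all 1000 runs; if a fresh list per run matters, generate a pool of lists in the setup instead.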
Upvotes: 1
Reputation: 7112
Could you replace this code:
A = [random.randint(0,l-1) for r in xrange(n)]
with a generator? e.g.
def A(n):
    for r in xrange(n):
        yield random.randint(0, l-1)
I think most of the time in your empty test is spent generating the random lists.
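For what it's worth, consuming the generator still computes every random number, so if the sort needs a concrete list it has to be materialized first. A hypothetical usage sketch (range instead of xrange under Python 3):

```python
import random

l = 100

def A(n):
    # yields n random ints lazily instead of building the list up front
    for r in range(n):
        yield random.randint(0, l - 1)

data = list(A(16))  # materialize only when a real list is required
```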
Upvotes: 2
Reputation: 76965
timeit. Best way to time in Python, period. Refactor your algorithms into functions and use timeit to test the execution time.
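A minimal sketch of that approach (Python 3 syntax; my_sort and the list parameters are placeholders for the OP's own algorithm and sizes):

```python
import random
import timeit

def my_sort(lst):
    # placeholder for the sorting algorithm under test
    return sorted(lst)

A = [random.randint(0, 99) for _ in range(1000)]

# timeit accepts a callable; number=100 gives the total time for 100 calls
total = timeit.timeit(lambda: my_sort(A), number=100)
print('avg per call: %.6f s' % (total / 100))
```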
Upvotes: 12