Reputation: 12743
My code has become really slow after some last changes I have done. A searching task takes 102 seconds instead of 2-3 seconds.
I've tried using the profile
class in order to find the limiting function, and here is the output:
>>> import WebParser
>>>
>>> w = WebParser.LinksGrabber
>>>
>>> import cProfile
>>> cProfile.run("w.search('nicki minaj', 15)")
50326 function calls in 102.745 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 102.745 102.745 <string>:1(<module>)
6 0.000 0.000 0.000 0.000 Config.py:110(__getattr__)
1 0.000 0.000 102.745 102.745 LinksGrabber.py:427(search)
5 0.000 0.000 0.002 0.000 Queue.py:107(put)
911 0.040 0.000 102.726 0.113 Queue.py:150(get)
..................................
}
6836 0.022 0.000 0.022 0.000 {min}
917 0.009 0.000 0.009 0.000 {thread.allocate_lock}
3 0.000 0.000 0.000 0.000 {thread.get_ident}
3 0.000 0.000 0.000 0.000 {thread.start_new_thread}
6835 100.458 0.015 100.458 0.015 {time.sleep}
11346 0.035 0.000 0.035 0.000 {time.time}
It shows that a time.sleep
code is waiting for 100.458s
, but I can't find the code piece in my WebParser.LinksGrabber
class.
How can I use profile
to get more information about the slow piece of code?
Upvotes: 1
Views: 599
Reputation: 41950
I have a piece of boilerplate code I add in to programs I need to profile, which I can easily enable or disable by changing the PROFILE
variable to True
or False
...
#!/usr/bin/env python
# Should we profile the code?
PROFILE = True
# What proportion of the profile data should we keep?
PROFILE_LIMIT = 0.5 # 50%
# Where to store the raw profile data
PROFILE_DAT = 'profile.dat'
# Where to store the formatted profile data
PROFILE_TXT = 'profile.txt'
# Main code starts here
import time
def do_something():
for i in range(5):
time.sleep(0.1)
def do_something_else():
for i in range(10):
time.sleep(0.1)
def main():
do_something()
do_something_else()
if __name__ == '__main__':
if PROFILE:
import os, cProfile, pstats
cProfile.runctx('main()', globals(), locals(), PROFILE_DAT)
f = open(PROFILE_TXT, 'wb')
for sort_key in 'time', 'cumulative':
stats = pstats.Stats(PROFILE_DAT, stream=f)
stats.sort_stats(sort_key)
stats.print_stats(PROFILE_LIMIT)
stats.strip_dirs()
stats.sort_stats(sort_key)
if sort_key == 'time':
stats.print_callers(PROFILE_LIMIT)
else:
stats.print_callees(PROFILE_LIMIT)
f.close()
os.unlink(PROFILE_DAT)
else:
main()
...which creates a text file profile.txt
containing...
...which looks like this...
[...]
Ordered by: internal time
List reduced from 7 to 4 due to restriction <0.5>
ncalls tottime percall cumtime percall filename:lineno(function)
15 1.627 0.108 1.627 0.108 {time.sleep}
1 0.000 0.000 1.092 1.092 foo.py:23(do_something_else)
1 0.000 0.000 0.536 0.536 foo.py:19(do_something)
1 0.000 0.000 1.628 1.628 foo.py:27(main)
[...]
Function was called by...
ncalls tottime cumtime
{time.sleep} <- 5 0.535 0.535 foo.py:19(do_something)
10 1.092 1.092 foo.py:23(do_something_else)
foo.py:23(do_something_else) <- 1 0.000 1.092 foo.py:27(main)
foo.py:19(do_something) <- 1 0.000 0.536 foo.py:27(main)
foo.py:27(main) <- 1 0.000 1.628 <string>:1(<module>)
[...]
Ordered by: cumulative time
List reduced from 7 to 4 due to restriction <0.5>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 1.628 1.628 <string>:1(<module>)
1 0.000 0.000 1.628 1.628 foo.py:27(main)
15 1.627 0.108 1.627 0.108 {time.sleep}
1 0.000 0.000 1.092 1.092 foo.py:23(do_something_else)
[...]
Function called...
ncalls tottime cumtime
<string>:1(<module>) -> 1 0.000 1.628 foo.py:27(main)
foo.py:27(main) -> 1 0.000 0.536 foo.py:19(do_something)
1 0.000 1.092 foo.py:23(do_something_else)
{time.sleep} ->
foo.py:23(do_something_else) -> 1 0.000 0.000 {range}
10 1.092 1.092 {time.sleep}
...which is usually sufficient to track down where the code needs improving.
Upvotes: 2