Sammy25

Reputation: 45

Python "range" resource consumption

I wrote the following script

Basically, I'm just learning Python for Machine Learning and wanted to check how really computationally intensive tasks would perform. I observe that for 10**8 iterations, Python takes up a lot of RAM (around 3.8 GB) and so much CPU time that it froze my system.

I want to know if there is any way to limit the time/memory consumption, either through code or some global settings.

Script -

import time

initial_start = time.clock()
for i in range(9):
    start = time.clock()
    for j in range(10**i):
        pass
    stop = time.clock()
    print 'Looping 10**', i, 'times takes', stop - start, 'seconds'
final_stop = time.clock()
print 'Overall program time is', final_stop - initial_start, 'seconds'
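For reference, time.clock was deprecated and removed in Python 3.8; a Python 3 sketch of the same benchmark using time.perf_counter (with the exponent capped at 10**7 here to keep the run short; raise it to reproduce the 10**8 case) would be:

```python
import time

initial_start = time.perf_counter()
for i in range(8):  # up to 10**7 here; use range(9) for the original 10**8 case
    start = time.perf_counter()
    for j in range(10**i):
        pass
    stop = time.perf_counter()
    print('Looping 10**%d times takes %f seconds' % (i, stop - start))
final_stop = time.perf_counter()
print('Overall program time is %f seconds' % (final_stop - initial_start))
```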

Upvotes: 3

Views: 4951

Answers (5)

user4815162342

Reputation: 155056

If you're considering Python for machine learning, take a look at numpy. Its philosophy is to implement all "inner loops" (matrix operations, linear algebra) in optimized C, and to use Python to manipulate input and output and to manage high-level algorithms - rather like Matlab, but with Python as the glue language. That gives you the best of both worlds: the ease and readability of Python, and the speed of C.

To get back to your question, benchmarking numpy operations will give you a more realistic assessment of Python's performance for machine learning.
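As an illustrative sketch (assuming numpy is installed), compare summing 10**6 integers with a pure-Python loop against a single vectorized numpy call:

```python
import time
import numpy as np

n = 10**6
data = list(range(n))
arr = np.arange(n)

# Pure-Python loop: the interpreter executes n bytecode iterations.
t0 = time.perf_counter()
total_py = 0
for x in data:
    total_py += x
t1 = time.perf_counter()

# numpy: one call, the loop runs in optimized C.
t2 = time.perf_counter()
total_np = int(arr.sum())
t3 = time.perf_counter()

print('python loop: %.4fs  numpy: %.4fs' % (t1 - t0, t3 - t2))
```

On a typical machine the numpy call is one to two orders of magnitude faster, and both of course compute the same result.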

Upvotes: 1

Ryan Haining

Reputation: 36822

Look at this question: How to limit the heap size?

To address your script, the timeit module measures the time it takes to perform an action more accurately:

>>> import timeit
>>> for i in range(9):
...     print timeit.timeit(stmt='pass', number=10**i)
...
0.0
0.0
0.0
0.0
0.0
0.015625
0.0625
0.468752861023
2.98439407349

Your example is spending most of its time building the gigantic lists of numbers you're putting in memory. Using xrange instead of range will fix that issue, but you're still using a poor benchmark: the loop executes over and over without actually doing anything, so the CPU is just busy checking the condition and entering the loop.

As you can see, creating these lists takes the majority of the time here:

>>> timeit.timeit(stmt='range(10**7)', number=1)
0.71875405311584473
>>> timeit.timeit(stmt='for i in range(10**7): pass', number=1)
1.093757152557373

Upvotes: 2

Steven Rumbalski

Reputation: 45541

In Python 2, range creates a list. Use xrange instead. For a more detailed explanation see Should you always favor xrange() over range()?
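The difference is easy to see concretely (Python 3 shown, where range is already lazy like Python 2's xrange):

```python
import sys

lazy = range(10**8)          # constant-size object; no elements are stored
eager = list(range(10**3))   # materializes every element up front

print(sys.getsizeof(lazy))   # a few dozen bytes, regardless of the length
print(sys.getsizeof(eager))  # grows with the number of elements
```

A lazy range of 10**8 elements costs the same handful of bytes as a range of 10, which is why swapping it in removes the 3.8 GB spike.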

Note that a no-op for loop is a very poor benchmark that tells you pretty much nothing about Python.

Also note, as per gnibbler's comment, that Python 3's range works like Python 2's xrange.

Upvotes: 8

Sachin

Reputation: 925

As regards CPU, you have a for loop running for hundreds of millions of iterations without any sort of sleep or pause in between, so it's no wonder the process hogs the CPU completely (at least on a single-core machine).

Upvotes: 0

Rostyslav Dzinko

Reputation: 40765

Python takes so much RAM because you're creating a very large list of 10**8 elements with the range function. That's where iterators become useful.

Use xrange instead of range.

It works the same way as range does, but instead of creating that large list in memory, xrange just keeps an internal index, incrementing its value by 1 each iteration.
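The lazy behaviour xrange provides can be sketched with a plain generator (illustrative only, not the actual implementation):

```python
def lazy_range(n):
    """Yield 0, 1, ..., n-1 one at a time instead of building a list."""
    i = 0
    while i < n:
        yield i
        i += 1

# Consuming it uses O(1) memory no matter how large n is.
for j in lazy_range(5):
    print(j)
```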

Upvotes: 2
