ChrisD
ChrisD

Reputation: 674

Iterating over a large list

I have a list in the range [465868129, 988379794] both inclusive. When I use the following code I get a Memory Error. What can I do?

r = [465868129, 988379794]
list = [x for x in xrange(r[0], r[1]+1)]

Upvotes: 1

Views: 507

Answers (2)

avenet
avenet

Reputation: 3043

Unless you have a very good reason to store the list items in a list, iterate over the generator instead, that way Python won't need to allocate a lot of memory (causing your Memory Error) to create that list:

init, end = (465868129, 988379794)
items = xrange(init, end + 1)

for item in items:
    #Do something with item

To count squares on an arbitrary range consider the following formula:

import math

number_of_squares = int(math.sqrt(end) - math.sqrt(init)) + 
                    op(is_perfect_square(init), is_perfect_square(end))

The is_perfect_square(n) is another problem on its own, so check this post if interested.

The op is used to adjust the number of squares when the init and end of the intervals init (or/and/neither) end are perfect squares. So we need a function with the following characteristics:

  • Both numbers are perfect squares: Eg: 25,64 => 8 - 5 = 3 (and there are 4 squares on that range). (it should sum 1 more)
  • End is a perfect square: Eg: 26,64 => 8 - 5 = 3 (There are 3 squares on that range). (it is correct => it should sum 0)
  • Init is a perfect square: Eg: 25,65 => 8 - 5 = 3 (There are 4 squares on that range). (it should sum 1 more)
  • None of the numbers are primes: Eg: 26, 65 => 8 - 5 = 3 (There are 3 squares on that range) (it is correct => it should sum 0)

So we need an operator with the following characteristics, based on the past examples:

  • 1 op 1 = 1 (Both numbers are perfect squares)
  • 0 op 1 = 0 (End is a perfect square)
  • 1 op 0 = 1 (Init is a perfect square)
  • 0 op 0 = 0 (None of the numbers are perfect squares)

Note that the max function almost fulfils our needs, but it fails on the second case max(0,1) = 1 and it should be 0.

So, looks like the result only depends on the first operator: if it's one, the result is 1, on the other hand if it's 0, it returns 0.

So, it's easy to write the function with that in mind:

import math

number_of_squares = int(math.sqrt(end) - math.sqrt(init)) + 
                    int(is_perfect_square(init))

Thanks to @kojiro, we have this approach (having a similar idea), which is easier to read:

from math import sqrt, floor, ceil

number_of_squares = 1 + floor(sqrt(end)) - ceil(sqrt(init))

Upvotes: 1

John Kugelman
John Kugelman

Reputation: 361595

You could iterate over the xrange directly instead of creating a list.

for x in xrange(r[0], r[1] + 1):
    ...

But iterating over such a large range is a very, very slow way to find squares. The fact that you run out of memory should alert you that a different approach is needed.

A much better way would be to take the square roots of each end point and then iterate between the square roots. Each integer between the square roots, when squared, would give you one of the numbers you're searching for.

In fact, if you're clever enough, you can generate all the squares with a single list comprehension and avoid an explicit for loop entirely.

Upvotes: 4

Related Questions