btrif

Reputation: 317

Python large list error

I wrote the following program to generate a list of consecutive numbers. However, the computation seems to fail once the list has more than about 70,000 elements. I tried both the PyCharm IDE and the Python console, and the result is the same. I'm using the 32-bit build of Python 3.4.1. What could be the cause, and what should I do?

from pylab import *

a = 100000                    # the number of elements from my_array
my_array = [i for i in range(a)]

missing_number = randint(a)
print('Generate a Random number: ', missing_number)
my_array.remove(missing_number)   # We remove the random generated number from my_array
print('The number of elements of the list is: ', len(my_array))     #Length of  my_array
print('the sum of the list is :',sum(my_array))             # Sum

sum02 = (a *(a-1)/2)        #  The sum of consecutive numbers
print('The complete sum of the consecutive numbers:',int(sum02),'\n')
print('And the missing number is:', int(sum02) - sum(my_array))

Here is the output I get locally on my machine:

C:\Util\Python34\python.exe "find_missing_number_2.py"

Generate a Random number:  15019
The number of elements of the list is:  99999
the sum of the list is : 704967685
The complete sum of the consecutive numbers: 4999950000
And the missing number is: 4294982315
Process finished with exit code 0 

It doesn't raise an error. It just produces a wrong result, as you can see by comparing missing_number with the value of int(sum02) - sum(my_array).

Upvotes: 3

Views: 355

Answers (4)

btrif

Reputation: 317

I found the cause. As user2313067 pointed out, I imported everything from the pylab module, and some of its functions shadow Python built-ins of the same name. Importing a whole module with * is bad practice, especially when you only need one function. In this case the solution is:

from pylab import randint 

With that change the code works even for very large lists (a = 10000000). My result is now correct:

C:\Util\Python34\python.exe "find_missing_number_2.py"
Generate a Random number:  3632972
The number of elements of the list is:  9999999
the sum of the list is : 49999991367028
The complete sum of the consecutive numbers: 49999995000000 

And the missing number is: 3632972

Process finished with exit code 0
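
For reference, a minimal sketch of the same program with explicit imports only. It swaps pylab's randint for random.randrange from the standard library, which is an assumption made here to keep the example dependency-free; the function name find_missing_number is hypothetical. Because nothing shadows the built-in sum, the arithmetic is done with Python's arbitrary-precision integers.

```python
from random import randrange   # assumption: stdlib stand-in for pylab's randint


def find_missing_number(a):
    """Remove one random value from 0..a-1 and recover it from the sums."""
    my_array = list(range(a))
    missing_number = randrange(a)            # random integer in [0, a)
    my_array.remove(missing_number)

    expected_total = a * (a - 1) // 2        # integer division keeps the value exact
    recovered = expected_total - sum(my_array)   # built-in sum, no overflow
    return missing_number, recovered


missing, recovered = find_missing_number(100000)
print(missing == recovered)
```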

Upvotes: 0

jcfollower

Reputation: 3158

How about creating your own function for summing consecutive numbers ...

def consecutive_sum(first, last):
    # Integer arithmetic keeps the result exact; float division loses precision above 2**53
    count = last - first + 1
    return (first + last) * count // 2

Then you don't need a list of numbers and you can just get one random number (e.g. n) and ...

sum1 = consecutive_sum(1, n-1)
sum2 = consecutive_sum(n+1, max_num)
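
A short sketch of how the two partial sums recover the chosen number without building a list. The values of max_num and n are illustrative assumptions, and consecutive_sum is written here with integer arithmetic so the result stays exact for large inputs.

```python
def consecutive_sum(first, last):
    """Sum of the integers first..last inclusive, using integer arithmetic only."""
    if last < first:
        return 0                      # empty range, e.g. when n == 1
    return (first + last) * (last - first + 1) // 2


max_num = 100000                      # illustrative upper bound
n = 15019                             # stands in for the randomly chosen number

sum1 = consecutive_sum(1, n - 1)      # everything below n
sum2 = consecutive_sum(n + 1, max_num)  # everything above n
total = consecutive_sum(1, max_num)

print(total - (sum1 + sum2))          # 15019
```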

Upvotes: 0

user2313067

Reputation: 603

from pylab import * performs a from numpy import *. That brings in the numpy.sum function, whose documentation explicitly says: "Arithmetic is modular when using integer types, and no error is raised on overflow."
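
The numbers in the question are consistent with a sum that wrapped around at 32 bits. The check below uses only plain Python integers and the values printed in the question; the 32-bit accumulator width is an assumption inferred from the 32-bit Python build, not something the question states.

```python
# Values copied from the question's output.
a = 100000
missing_number = 15019
reported_sum = 704967685          # what sum(my_array) printed
reported_missing = 4294982315     # what the last line printed

expected_total = a * (a - 1) // 2             # 0 + 1 + ... + (a - 1)
true_sum = expected_total - missing_number    # exact sum of the remaining elements

# Assumption: the accumulator was 32 bits wide, so the sum was reduced modulo 2**32.
wrapped_sum = true_sum % 2**32

print(wrapped_sum == reported_sum)                   # True
print(reported_missing - missing_number == 2**32)    # True
```

Both comparisons hold, so the reported result is off by exactly 2**32, which is what modular 32-bit arithmetic would produce.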

To avoid this, use the built-in sum function, either as shown by Reut Sharabani or by not doing from pylab import *, which is bad practice anyway: it can replace built-in functions without you noticing. As far as I know, it currently replaces at least sum and all, but I'm not sure those are the only ones, and you can't be sure it won't replace others in the future.
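
One way to see which built-ins a star import has hidden is to compare a namespace against the builtins module. This is a rough sketch: shadowed_builtins and fake_sum are hypothetical helpers, and the dictionary below only simulates what a star import might leave behind.

```python
import builtins


def shadowed_builtins(namespace):
    """Return the names in `namespace` that hide a built-in of the same name."""
    hidden = []
    for name, value in namespace.items():
        if name.startswith('_'):
            continue  # skip entries such as __name__ and __builtins__
        if hasattr(builtins, name) and getattr(builtins, name) is not value:
            hidden.append(name)
    return sorted(hidden)


def fake_sum(values):
    """Stand-in for a library function that happens to be called `sum`."""
    return 0


# Simulated module namespace after a star import.
namespace = {'sum': fake_sum, 'randint': object(), 'len': len}
print(shadowed_builtins(namespace))   # ['sum']
```

In a real script, calling shadowed_builtins(globals()) right after the star import would list the affected names.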

Upvotes: 3

Reut Sharabani

Reputation: 31339

If your problem is the size of the list, try using xrange (Python 2 only; in Python 3, xrange no longer exists and range is already lazy):

# my_array = [i for i in range(a)]
my_array = xrange(a)

Also, my_array = [i for i in range(a)] is the same as my_array = range(a) if you're using Python 2.x. In Python 3 the equivalent is my_array = list(range(a)).
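
Since the question uses Python 3.4.1, where range is itself lazy, a brief sketch of the Python 3 behaviour may help. The size threshold of 1000 bytes is just an illustrative bound, not a documented figure.

```python
import sys

a = 10000000
lazy = range(a)                       # no list is built here

# The range object stores only start, stop and step, so it stays small.
print(sys.getsizeof(lazy) < 1000)     # True
print(len(lazy))                      # 10000000

# Materialising it gives the same list as the comprehension in the question.
print(list(range(5)) == [i for i in range(5)])   # True
```

Note that a range object has no remove method, so the program in the question still needs a real list for that step.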

Edit: To use the built-in sum (arbitrary precision):

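import builtins              # Python 3 module that holds the built-in functions
builtins.sum(my_array)       # sum the list, not the integer a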

Upvotes: 0
