Gregg Lind

Reputation: 21312

Python, len, and size of ints

So, CPython (2.4) has some interesting behaviour when the length of something gets near 1<<32 (the number of values a 32-bit int can hold).

r = xrange(1<<30)
assert len(r) == 1<<30

is fine, but:

r = xrange(1<<32)
assert len(r) == 1<<32
ValueError: xrange object size cannot be reported
__len__() should return 0 <= outcome

Alex's wowrange has this behaviour as well: wowrange(1<<32).l is fine, but len(wowrange(1<<32)) fails. I'm guessing there is some floating point behaviour (the value being read as negative) going on here.

  1. What exactly is happening here? (this is pretty well-solved below!)
  2. How can I get around it? Longs?

(My specific application is random.sample(xrange(1<<32), ABUNCH), if people want to tackle that question directly!)
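For the sampling use case specifically, one way to sidestep len() altogether is to draw distinct values with randrange until enough have been collected. The following is only a sketch, written in Python 3 syntax; sample_below is a hypothetical helper name, not part of the random module, and it assumes the sample size is small compared to the range (otherwise the loop slows down as collisions become common).

```python
import random

def sample_below(n, k, rng=random):
    """Return k distinct integers drawn from [0, n) without ever calling len()."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    chosen = set()
    # Keep drawing until we have k distinct values; duplicates are simply ignored.
    while len(chosen) < k:
        chosen.add(rng.randrange(n))
    # Note: the order of the returned list is arbitrary, not a random permutation.
    return list(chosen)

picked = sample_below(1 << 32, 1000)
```

If the order of the result matters, random.shuffle(picked) can be applied afterwards.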

Upvotes: 5

Views: 4687

Answers (3)

Sapph

Reputation: 6208

You'll find that

xrange((1 << 31) - 1)

is the largest one that behaves as you want. This is because the maximum signed 32-bit integer is 2^31 - 1. (The parentheses matter: - binds tighter than <<, so 1 << 31 - 1 would actually be 1 << 30.)

1 << 32 does not fit in a signed 32-bit integer (which is what Python's int datatype is on a 32-bit build), so that's why you're getting that error.

In Python 2.6, I can't even do xrange(1 << 32) or xrange(1 << 31) without getting an error, much less len on the result.

Edit If you want a little more detail...

1 << 31 represents the number 0x80000000 which in 2's complement representation is the lowest representable negative number (-1 * 2^31) for a 32-bit int. So yes, due to the bit-wise representation of the numbers you're working with, it's actually becoming negative.

For a 32-bit 2's complement number, 0x7FFFFFFF is the highest representable integer (2^31 - 1) before you "overflow" into negative numbers.
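To see the wrap-around concretely, here is a small sketch that reinterprets the low 32 bits of a Python integer as a two's complement signed value. as_signed_32 is a hypothetical helper written for illustration only; it is not what CPython does internally, just the same arithmetic spelled out.

```python
def as_signed_32(n):
    """Interpret the low 32 bits of n as a two's complement signed integer."""
    n &= 0xFFFFFFFF  # keep only the low 32 bits
    # If the top (sign) bit is set, the value represents n - 2**32.
    return n - (1 << 32) if n & 0x80000000 else n

print(as_signed_32(0x7FFFFFFF))  # 2147483647, the largest positive value
print(as_signed_32(1 << 31))     # -2147483648, the sign bit alone
print(as_signed_32(1 << 32))     # 0, the set bit falls outside the 32 bits
```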

Further reading, if you're interested.

Note that when you see something like 2147483648L in the prompt, the "L" at the end signifies that it's now being represented as a "long integer". A Python 2 long has arbitrary precision rather than a fixed width, so the value itself is fine; the problem only appears when it has to be squeezed back into a C integer.

Upvotes: 5

SingleNegationElimination

Reputation: 156308

CPython assumes that lists fit in memory. This extends to objects that behave like lists, such as xrange. Essentially, the len function expects the __len__ method to return something that is convertible to a signed C integer (a plain int in CPython 2.4, Py_ssize_t from 2.5 onward), which won't happen if the number of logical elements is too large, even if those elements don't actually exist in memory.
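The same split between calling __len__ directly and going through len() can be reproduced with a toy class. This is a sketch in Python 3 syntax, where the limit is sys.maxsize rather than 2^31 - 1 and the exception is an OverflowError rather than the ValueError seen in 2.4; Huge is a hypothetical class invented for the demonstration.

```python
import sys

class Huge(object):
    """Hypothetical container that claims more elements than len() can report."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

h = Huge(sys.maxsize + 1)

# Calling the method directly just returns a Python integer, so this works.
print(h.__len__() == sys.maxsize + 1)

# len() has to convert the result to a C integer, and that conversion fails.
try:
    len(h)
except OverflowError as exc:
    print("len() failed:", exc)
```

This mirrors wowrange(1<<32).l being fine while len(wowrange(1<<32)) is not: the stored value is never the problem, only the conversion that len() performs.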

Upvotes: 12

Anon.

Reputation: 60043

1<<31, when treated as a signed 32-bit integer, is negative, and 1<<32 does not fit in 32 bits at all.

Upvotes: 1
