Fahim
Fahim

Reputation: 348

How Python is working with a number bigger than the 64-bit unsigned integer limit?

From this question (How big can a 64bit signed integer be?), I learned that the biggest possible number to work with on a 64-bit machine is 2^64-1, which is 92,233,720,368,547,758,070. That means, even if I add 1 to it, it should return inf. But it's not showing inf. Here's what I'm observing:

>>> max = sys.maxsize
>>> format(max, ',')
'9,223,372,036,854,775,807'
>>> a = max * 10
>>> format(a, ',')
'92,233,720,368,547,758,070'
>>> a / max
10.0

Even if for some reason 92,233,720,368,547,758,070 is not the biggest number for Python, then what is the use of sys.maxsize?

Secondly, shouldn't a 64-bit number take 64-bit memory space? Why both max and a are taking 36 bytes?

>>> sys.getsizeof(max)
36
>>> sys.getsizeof(a)
36

Can anyone please describe both of the confusion?

Upvotes: 5

Views: 4937

Answers (1)

MisterMiyagi
MisterMiyagi

Reputation: 51999

Integers as Digit-Arrays

Python 3 (CPython) integers are not native machine integers. Logically, each integer consists of its sign and an absolute number in base 1073741824 (30-bit) or 32768 (15-bit) [*] - the latter being a variable-size array of unsigned integers. To store larger numbers, an additional "digit" is added to the array.

>>> sys.getsizeof(0)          # largest  0-digit number
24
>>> sys.getsizeof(1)          # smallest 1-digit number
28
>>> sys.getsizeof(2**30 - 1)  # largest  1-digit number
28
>>> sys.getsizeof(2**30)      # smallest 2-digit number
32
>>> sys.getsizeof(2**60 - 1)  # largest  2-digit number
32

Loosely speaking, this is the same mechanism as one adds digits when writing out decimal numbers – using 1-digit is enough up to 9, 2-digit up to 99, and so on. Likewise, as long as the computer has memory to "add a digit" a Python integer of larger size can be defined.

[*] The digits are 30-bit/15-bit instead of 32-bit/16-bit because this better fits some algorithms. For example, long_pow() requires a size divisible by 5.

Objects Header for Integers

Practically, integers are also objects – meaning they hold metadata such as type and reference count - which also takes up space. In CPython, an int consists of:

  • reference counter of Py_ssize_t
  • pointer to the type of PyTypeObject*
  • digit count of Py_ssize_t
  • variable array of digits of digit[]

where the first three are the structure of every variable size object. The sign is encoded inside the digit count.

On a 64-bit machine, both Py_ssize_t and PyTypeObject* are 8 byte in size – giving the "0-digit integer" 0 a size of 3*8 bytes or 24 bytes.

>>> sys.getsizeof(0)          # largest  0-digit number
24

So what is sys.maxsize?

The meaning of sys.maxsize is not the maximum integer size, but the maximum container size:

>>> len(range(sys.maxsize))    # this is fine
9223372036854775807
>>> len(range(sys.maxsize+1))  # this is one too much
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: range() result has too many items

This is a direct result of sys.maxsize expressing the maximum value of Py_ssize_t, the type used by the CPython runtime to represent and address memory. While this might seem like an arbitrary restriction, it is actually significantly more than what computers can address.

Upvotes: 7

Related Questions