Saad
Saad

Reputation: 107

Hash of integers in python

I understand that hash of an immutable object is an integer representation of that object which is unique within the process's lifetime.

Hash of an integer object is the same as the value held by the integer. For example,

>>> int(1000).__hash__()
1000

But when the integer grows big enough, above principle breaks after a certain threshold as it seems. Its value seems to be rolled with in some limit.

>>> int(10000000000000000).__hash__()
10000000000000000
>>> int(100000000000000000).__hash__()
100000000000000000
>>> int(1000000000000000000).__hash__()
1000000000000000000
>>> int(10000000000000000000).__hash__()
776627963145224196

Two questions:

  1. What is the limit? What is the integer-space that hash table covers?
  2. How is the hash value calculated for an integer exceeding the above limit?

System information:

Linux lap-0179 5.13.0-44-generic #49~20.04.1-Ubuntu SMP Wed May 18 18:44:28 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Python interpreter:

Python 3.8.10 (default, Mar 15 2022, 12:22:08) 
[GCC 9.4.0] on linux

Upvotes: 4

Views: 1171

Answers (1)

norok2
norok2

Reputation: 26886

While this is machine and implementation dependent, for CPython on 64-bit machines, the hash() for a non-negative integer n is computed as n % k with k = (2 ** 61 - 1) (= 2305843009213693951) hence values between 0 and k - 1 are left as is.

This is empirically evidenced here:

k = 2 ** 61 - 1
for i in range(k - 2, k + 2):
    print(i, hash(i), i % k)
# 2305843009213693949 2305843009213693949
# 2305843009213693950 2305843009213693950
# 2305843009213693951 0
# 2305843009213693952 1

For the complete ruleset, see the documentation.

Upvotes: 2

Related Questions