InKwon Park

Reputation: 41

Python: how to get a 4-byte sized integer from a 4-byte byte array?

Here is some simple Python (version 3.4) code I've written to get a 32-bit integer (an int, I would assume) from an array of 4 bytes:

import struct
import sys

float_val = 1.0 + 0.005
print(float_val)

packed = struct.pack('f', float_val)
print(len(packed)) 

tempint2 = struct.unpack(">I", packed)[0]
tempint3 = struct.unpack_from(">I", packed)[0]
tempint4 = int.from_bytes(packed, byteorder='big', signed=False)

print(sys.getsizeof(tempint2))
print(tempint2)
print(sys.getsizeof(tempint3))
print(tempint3)
print(sys.getsizeof(tempint4))
print(tempint4)

However, none of the attempts (tempint2/tempint3/tempint4) gives the result I expected (a 4-byte integer). Instead, sys.getsizeof() reports a size of 18 bytes for each of them. Can you tell me how to get the expected result (a 4-byte, i.e. 32-bit, integer)?

Upvotes: 1

Views: 2669

Answers (1)

3442

Reputation: 8576

First of all, due to Python's... ahem... "magic", sys.getsizeof() doesn't return the number of bytes needed to hold the integer's value, but the size of the whole object as represented internally by the Python interpreter (the value plus the interpreter's bookkeeping overhead).
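To make the difference concrete, here is a minimal sketch comparing what sys.getsizeof() reports with the number of bytes the value occupies once it is packed or stored in a fixed-width C integer. The sample value and the use of ctypes.c_uint32 are assumptions chosen for illustration, not something taken from the question's code.

```python
import ctypes
import struct
import sys

value = 1065395159  # sample value that fits in 32 bits (illustrative assumption)

# Size of the whole Python int object (header plus digits); implementation-dependent
object_size = sys.getsizeof(value)

# Size of the value when serialized as an unsigned 32-bit integer
packed_size = len(struct.pack(">I", value))

# Size of a fixed-width C integer holding the same value
c_size = ctypes.sizeof(ctypes.c_uint32(value))

print(object_size, packed_size, c_size)
```

The first number depends on the interpreter build, while the other two are 4 because the format code and the C type fix the width.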

Now, the answer (for integers) is simply to compute the minimum number of bytes needed to hold the value (this works for all combinations of Python 2.x/Python 3.x and 32-bit/64-bit):

from math import ceil, floor, log

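def minimumAmountOfBytesToHoldTheStuff(x):
    # Avoid math domain errors: log() is undefined for negative numbers,
    # so map a negative x to the non-negative ~x (that is, -x - 1)
    if x < 0:
        x = ~x

    # Avoid more math domain errors: log(0) is undefined, so treat 0 like 1
    if x == 0:
        x = 1

    # floor(log2(x)) + 1 is the number of bits; round up to whole bytes
    return int(ceil((floor(log(x, 2)) + 1) / 8))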

def powersOfTwo():
    x = 1
    while True:
        yield x
        x *= 2

def minimumAmountOfBytesToHoldTheStuffOnRealMachines(x):
    # Round the byte count up to the next power of two (1, 2, 4, 8, ...)
    numBytes = minimumAmountOfBytesToHoldTheStuff(x)
    for power in powersOfTwo():
        if numBytes <= power:
            return power

print(minimumAmountOfBytesToHoldTheStuffOnRealMachines(tempint2))

Note: It appears that log(x, 2) breaks for x >= pow(2, 48) - 1, and so does the whole algorithm. This is probably caused by floating-point accuracy errors in the C library, because log(n, x) in Python is translated into log(n) / log(x) in C.
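One way to check where the logarithm starts to disagree with exact arithmetic is sketched below. The helper names are hypothetical, and the bit width at which the first mismatch shows up (if any) may depend on the platform's C library, so no particular result is claimed here.

```python
from math import floor, log

def bits_via_log(x):
    # Bit count computed with the floating-point logarithm; assumes x >= 1
    return int(floor(log(x, 2))) + 1

def bits_via_bit_length(x):
    # Exact bit count from integer arithmetic; assumes x >= 1
    return x.bit_length()

def first_mismatch(max_bits=64):
    # Test the largest value of each bit width, 2**n - 1, which sits
    # just below a power of two and is the most likely to round badly
    for n in range(1, max_bits + 1):
        x = 2 ** n - 1
        if bits_via_log(x) != bits_via_bit_length(x):
            return n
    return None

print(first_mismatch())
```

If a number is printed, it is the smallest bit width at which the two methods disagree on this machine.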

Edit: This is an optimized version for Python 3.x that is independent of both floating-point and logarithmic operations, and thus is accurate in all situations...

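def minimumAmountOfBytesToHoldTheStuff(x):
    # Map a negative x to the non-negative ~x (that is, -x - 1)
    if x < 0:
        x = ~x

    # Treat 0 like 1 so that at least one byte is reported
    if x == 0:
        x = 1

    # Integer-only ceiling division: whole bytes needed for the bit count
    return (x.bit_length() + 7) // 8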

The other functions are the same.
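As a rough end-to-end sketch, the value from the question can be run through this approach as follows. The helper names here are hypothetical stand-ins for the functions above. Note also that the question's code packs with the native-order format 'f' but unpacks with the big-endian '>I', which mixes byte orders on little-endian machines; the sketch assumes big-endian on both sides.

```python
import struct

def bytes_needed(x):
    # Minimum whole bytes for a non-negative integer (at least 1)
    return max(1, (x.bit_length() + 7) // 8)

def round_up_to_power_of_two(n):
    # Smallest power of two that is >= n
    power = 1
    while power < n:
        power *= 2
    return power

# Pack and unpack with the same explicit byte order
packed = struct.pack(">f", 1.0 + 0.005)
as_int = struct.unpack(">I", packed)[0]

width = round_up_to_power_of_two(bytes_needed(as_int))
print(hex(as_int), width)
```

For this input the reported width is 4, which is the 4-byte size the question was looking for, even though the Python int object itself remains larger.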

I hope this has shed some light on this for you!

Upvotes: 1
