Reputation: 1571
How can I speed up reading 12 bit little endian packed data in Python?
The following code, based on https://stackoverflow.com/a/37798391/11687201, works but takes far too long.
import bitstring
import numpy as np
# byte_string (read from file) contains 12-bit little-endian packed image data
# b'\xAB\xCD\xEF' -> pixel 1 = 0x0DAB, pixel 2 = 0x0EFC
# width and height are the width and height of the image that was read
image = np.empty(width*height, np.uint16)
ic = 0
ii = np.empty(width*height, np.uint16)
for oo in range(0,len(byte_string)-2,3):
    aa = bitstring.BitString(byte_string[oo:oo+3])
    aa.byteswap()
    ii[ic+1], ii[ic] = aa.unpack('uint:12,uint:12')
    ic=ic+2
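To make the byte layout from the comments concrete, here is a minimal sketch that decodes a single 3-byte group with plain integer arithmetic. The helper name `unpack_pair` is hypothetical and only serves as an illustration; it is not part of bitstring or NumPy.

```python
def unpack_pair(group):
    """Decode one 3-byte group into two 12-bit values (hypothetical helper)."""
    if len(group) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Read the three bytes as one 24-bit little-endian word.
    word = int.from_bytes(group, "little")
    # The low 12 bits are the first pixel, the high 12 bits the second.
    return word & 0xFFF, (word >> 12) & 0xFFF


first, second = unpack_pair(b"\xAB\xCD\xEF")
print(hex(first), hex(second))  # 0xdab 0xefc
```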
Upvotes: 1
Views: 2151
Reputation: 1571
I found a solution that runs much faster on my system than the one in https://stackoverflow.com/a/65851364/11687201, which was already a great improvement (2 seconds instead of the 2 minutes taken by the code in the question). Loading one of my image files with the code below takes approximately 45 milliseconds, compared with approximately 2 seconds for that solution.
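import numpy as np
import math
# view the raw bytes as uint8 without copying
image = np.frombuffer(byte_string, np.uint8)
# keep only the bytes holding width*height pixels, rounded up to whole 3-byte groups
num_bytes = math.ceil((width*height)*1.5)
num_3b = math.ceil(num_bytes / 3)
last = num_3b * 3
image = image[:last]
# one row per 3-byte group (two packed pixels)
image = image.reshape(-1,3)
# pad each group with a zero byte so it can be read as one 32-bit word
image = np.hstack( (image, np.zeros((image.shape[0],1), dtype=np.uint8)) )
image.dtype='<u4' # 'u' for unsigned int; each row is now one little-endian uint32
# append a second column to hold the second pixel of each group
image = np.hstack( (image, np.zeros((image.shape[0],1), dtype=np.uint8)) )
image[:,1] = (image[:,0] >> 12) & 0xfff # upper 12 bits: second pixel
image[:,0] = image[:,0] & 0xfff # lower 12 bits: first pixel
image = image.astype(np.uint16)
image = image.reshape(height, width)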
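For comparison, the same unpacking can also be written by combining the byte columns directly instead of reinterpreting padded rows as 32-bit words. This is a different technique from the code above, offered as a sketch only: I have not benchmarked it, the function name `unpack_12bit_columns` is hypothetical, and it assumes the buffer contains complete 3-byte groups for all pixels.

```python
import numpy as np


def unpack_12bit_columns(byte_string, width, height):
    """Unpack 12-bit little-endian packed pixels (hypothetical helper)."""
    num_pixels = width * height
    num_groups = (num_pixels + 1) // 2  # two pixels per 3-byte group
    data = np.frombuffer(byte_string, dtype=np.uint8)
    if data.size < num_groups * 3:
        raise ValueError("buffer too short for the requested image size")
    # Widen to uint16 first so the shifts below cannot overflow 8 bits.
    data = data[:num_groups * 3].reshape(-1, 3).astype(np.uint16)
    # First pixel: byte 0 plus the low nibble of byte 1 as the top 4 bits.
    first = data[:, 0] | ((data[:, 1] & 0x0F) << 8)
    # Second pixel: high nibble of byte 1 plus byte 2 shifted up by 4 bits.
    second = (data[:, 1] >> 4) | (data[:, 2] << 4)
    # Interleave first/second so the pixels come out in their original order.
    pixels = np.stack((first, second), axis=1).reshape(-1)
    return pixels[:num_pixels].reshape(height, width)


sample = b"\xAB\xCD\xEF\x12\x34\x56"
print(unpack_12bit_columns(sample, 2, 2))
```

For the sample bytes, the first row should hold 0x0DAB and 0x0EFC, matching the layout described in the question.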
Upvotes: 3
Reputation: 240649
This should work a bit better:
import struct
for oo in range(0,len(byte_string)-2,3):
    (word,) = struct.unpack('<L', byte_string[oo:oo+3] + b'\x00')
    ii[ic+1], ii[ic] = (word >> 12) & 0xfff, word & 0xfff
    ic += 2
It's very similar, but instead of using bitstring, which is quite slow, it uses a single call to struct.unpack to extract 24 bits at a time (padding with a zero byte so that it can be read as a 32-bit unsigned long) and then does some bit masking to extract the two different 12-bit parts.
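A self-contained version of that loop might look like the sketch below. The function name `unpack_12bit_struct` and the `num_pixels` parameter are hypothetical, introduced only so the example can run on its own; it assumes an even number of pixels and a buffer made of whole 3-byte groups.

```python
import struct

import numpy as np


def unpack_12bit_struct(byte_string, num_pixels):
    """Unpack 12-bit packed pixels with struct.unpack (hypothetical helper)."""
    pixels = np.empty(num_pixels, np.uint16)
    index = 0
    for offset in range(0, len(byte_string) - 2, 3):
        # Pad the 3-byte group with a zero byte so '<L' (4 bytes) can read it.
        (word,) = struct.unpack('<L', byte_string[offset:offset + 3] + b'\x00')
        pixels[index] = word & 0xFFF              # low 12 bits: first pixel
        pixels[index + 1] = (word >> 12) & 0xFFF  # high 12 bits: second pixel
        index += 2
    return pixels


sample = b'\xAB\xCD\xEF\x12\x34\x56'
print([hex(int(v)) for v in unpack_12bit_struct(sample, 4)])
# ['0xdab', '0xefc', '0x412', '0x563']
```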
Upvotes: 3