Reputation: 25
I'm working with large image datasets stored in a non-standard image format (.Tsm). Essentially it's a binary file with some headers at the start, very similar to FITS standard except stored in little-endian as opposed to FITS big-endian.
After reading the file header and formatting the metadata, I can read a single image using the following code
def __read_slice(self, file, img_num, dimensions):
"""Read a single image slice from .tsm file"""
pixel_range = self.metadata["pixel range"]
bytes_to_read = self.metadata["bytes to read"]
# position file pointer to correct byte
file.seek(self.HEADER_TOTAL_LEN + (bytes_to_read * img_num), 0)
all_bytes = file.read(bytes_to_read) # read image bytes
img = np.empty(len(pixels), dtype='uint16') # preallocate image vector
byte_idx = 0
for idx, pixel in enumerate(pixel_range):
img[idx] = (all_bytes[byte_idx + 1] << 8) + all_bytes[byte_idx]
byte_idx += 2
return np.reshape(img, (dimensions[1], dimensions[0])) # reshape array to correct dimensions
the trouble is the images can be very large (2048x2048) so even just loading in 20-30 frames for processing can take a significant amount of time. I'm new to python so i'm guessing the code here is pretty inefficient, especially the loop.
Is there a more efficient way to convert the byte data into 16bit integers?
Upvotes: 0
Views: 1282
Reputation: 95941
You can try:
img= np.frombuffer(all_bytes, dtype='uint16')
Example:
>>> np.frombuffer(b'\x01\x02\x03\x04', dtype='uint16')
array([ 513, 1027], dtype=uint16)
Upvotes: 2