Coddy

Reputation: 566

Numpy array bigger than the total size of images it is made up of

I am trying to convert a directory of RGB images to a numpy array, but the resulting array is way bigger than the combined size of all the image files. What is going on here?
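For illustration, here is a minimal sketch of the kind of conversion I mean (the folder name and loading loop are just an example, and it assumes every image has the same dimensions):

import os
import numpy as np
from PIL import Image

folder = 'images/'   # placeholder directory of RGB images
files = sorted(os.listdir(folder))

# Load each image as an RGB array and stack them into one big array
arrays = [np.asarray(Image.open(os.path.join(folder, f)).convert('RGB')) for f in files]
stack = np.stack(arrays)   # shape: (num_images, height, width, 3)

# Compare the in-memory size with the combined size of the files on disk
disk_bytes = sum(os.path.getsize(os.path.join(folder, f)) for f in files)
print(stack.nbytes, 'bytes in memory vs', disk_bytes, 'bytes on disk')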

Upvotes: 1

Views: 1185

Answers (2)

Mark Setchell

Reputation: 207385

The short answer is compression, and in the case of PNG, filtering. The longer answer is...

Imagine you make a screen-grab of your lovely 5K screen of 5120x2880 pixels. You would expect that to be this size and shape in Numpy:

import numpy as np

# Mock a screen-grab on 5120x2880 screen
grab = np.zeros((2880,5120,3), np.uint8)

Now let's get the size of it:

print(grab.nbytes)

and it is 44236800, i.e. 44MB

Now let's save that as a PNG:

from PIL import Image
Image.fromarray(grab).save('result.png')

and check its size:

-rw-r--r--@ 1 mark  staff  43010 18 Mar 09:14 result.png

so, it's 43kB, or 1000x smaller because of compression and filtering.
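You can make the same comparison from Python rather than the shell; a small check, assuming it is run after the snippets above so that grab and result.png already exist:

import os

# Compare the in-memory size of the array with the compressed size on disk
print('In memory:', grab.nbytes, 'bytes')
print('On disk:  ', os.path.getsize('result.png'), 'bytes')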


I admit that the case above is extreme, because a blank screen is very compressible, but in fact, it could have been worse. Imagine the screen was a high-depth 16-bit affair, and it captured an alpha channel as well. In that case, you'd have:

grab = np.zeros((2880,5120,4), np.uint16)
print(grab.nbytes)

Prints:

117964800      i.e. 117MB!

Upvotes: 2

Guilherme Poleto

Reputation: 367

That's because image files are usually compressed, so the data stored on disk is smaller than the raw pixel data it represents. When you open an image using PIL, for example, you get access to the RGB values of every pixel, so there is more data because it is uncompressed.
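A quick way to see this for yourself; a small sketch, assuming you have some compressed RGB image on disk (the filename here is just a placeholder):

import os
import numpy as np
from PIL import Image

filename = 'photo.jpg'   # placeholder: any compressed RGB image file

# Decompress the file into raw pixel data
pixels = np.asarray(Image.open(filename).convert('RGB'))

print('Compressed file on disk:', os.path.getsize(filename), 'bytes')
print('Uncompressed pixels in memory:', pixels.nbytes, 'bytes')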

Upvotes: 2
