Reputation: 7304
I am new to Python and am having difficulty understanding how image shapes are converted.
In my code, the image I has the following shape:
ipdb> I.shape
(720, 1280, 3)
The following command changes the shape of I and stores the result in h5_image:
h5_image = np.transpose(I, (2,0,1)).reshape(data_shape)
Where data_shape is:
ipdb> p data_shape
(1, 3, 720, 1280)
What is the equivalent OpenCV function that produces the same output?
In (1, 3, 720, 1280), what does the 1 mean?
What is the difference between (3, 720, 1280) and (720, 1280, 3)?
Upvotes: 0
Views: 620
Reputation: 3058
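You can think of an image (I) in Python/NumPy as a matrix with N dimensions.
Grayscale image: I.shape --> (rows, cols)
Color image with 3 channels (e.g. RGB): I.shape --> (rows, cols, 3)
Color image with 4 channels (e.g. RGBA): I.shape --> (rows, cols, 4)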
These are the common ways to store image data, but of course you can store it any way you like, as long as you know how to read it back. For example, you can keep it as one long 1-dimensional vector and also keep the image width and height, so you know how to reshape it into 2D form.
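As a small illustration of the flat-vector idea, here is a minimal sketch. The array size and pixel values are made up for the example; only the stored dimensions are needed to rebuild the image.

```python
import numpy as np

# Hypothetical tiny image size, chosen only for illustration.
rows, cols = 4, 6

# A 3-channel image stored in the common (rows, cols, channels) layout.
rgb = np.zeros((rows, cols, 3), dtype=np.uint8)
rgb[1, 2] = (10, 20, 30)  # set one pixel so we can check it later

# Store the image as one long 1-D vector...
flat = rgb.ravel()
print(flat.shape)  # (72,)

# ...and rebuild it using the dimensions we kept alongside it.
restored = flat.reshape(rows, cols, 3)
print(np.array_equal(restored, rgb))  # True
```

The vector alone is ambiguous: the same 72 values could also be read as (6, 4, 3) or (3, 4, 6), which is why the dimensions and the axis order have to be kept with it.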
For your more specific questions:
(1, 3, 720, 1280) only means you have an additional degenerate dimension of size 1. To access a pixel you have to write I[0, channel, row, col], because 0 is the only valid index along an axis of length 1. The 1 is not needed to describe a single image, and it is not a common way to hold an image array. Why do you want to do this? Do you want to save it in a specific format (HDF5?)
With (3, 720, 1280), to get the red channel (assuming RGB channel order) you write: red = I[0,:,:]. With (720, 1280, 3) you write: red = I[:,:,0] (this is the more common layout).
*There are some performance issues that depend on the actual arrangement of the image data in memory, but I don't think you need to worry about this right now.
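To make the relationship between the three shapes concrete, here is a minimal sketch using a random array in place of a real image. The variable names chw, row, col and channel are only illustrative, and the channel order depends on how the image was loaded.

```python
import numpy as np

# Stand-in for a real image: random uint8 data in (rows, cols, channels) layout.
rng = np.random.default_rng(0)
I = rng.integers(0, 256, size=(720, 1280, 3), dtype=np.uint8)

# Move the channel axis to the front: (720, 1280, 3) -> (3, 720, 1280).
chw = np.transpose(I, (2, 0, 1))
print(chw.shape)  # (3, 720, 1280)

# Add the leading axis of length 1, as in the question.
h5_image = chw.reshape((1, 3, 720, 1280))
print(h5_image.shape)  # (1, 3, 720, 1280)

# The same channel, addressed in each layout.
first_hwc = I[:, :, 0]
first_chw = chw[0, :, :]
first_nchw = h5_image[0, 0, :, :]
print(np.array_equal(first_hwc, first_chw))   # True
print(np.array_equal(first_hwc, first_nchw))  # True

# The same single value, addressed in each layout.
row, col, channel = 100, 200, 1
same_value = I[row, col, channel] == h5_image[0, channel, row, col]
print(same_value)  # True

# np.transpose returns a view with reordered strides, not a new copy,
# so the transposed array is no longer contiguous in memory.
print(chw.flags['C_CONTIGUOUS'])  # False
chw_copy = np.ascontiguousarray(chw)  # explicit contiguous copy
print(chw_copy.flags['C_CONTIGUOUS'])  # True
```

The last few lines hint at the memory-layout point: the transposed array reads the original buffer in a different order, and an explicit copy is one way to get contiguous data when a library requires it.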
Upvotes: 3