Reputation: 541
I am calculating the Structural Similarity Index between two images. I don't understand what the dimensionality should be. Both images (reference and target) are RGB images.
If I reshape each image to (256*256, 3), I obtain:
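import numpy as np
from PIL import Image
# compare_ssim lives in older scikit-image releases; newer ones provide
# skimage.metrics.structural_similarity instead
from skimage.measure import compare_ssim

# Flatten each image into a (65536, 3) array of pixels
ref = Image.open('path1').convert("RGB")
ref_array = np.array(ref).reshape(256*256, 3)
print(ref_array.shape) # (65536, 3)
img = Image.open('path2').convert("RGB")
img_array = np.array(img).reshape(256*256, 3)
print(img_array.shape) # (65536, 3)
ssim = compare_ssim(ref_array, img_array, multichannel=True, data_range=255)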
The result is 0.0786.
On the other hand, if I keep the original (256, 256, 3) shape:
ref = Image.open('path1').convert("RGB")
ref_array = np.array(ref)
print(ref_array.shape) # (256, 256, 3)
img = Image.open('path2').convert("RGB")
img_array = np.array(img)
print(img_array.shape) # (256, 256, 3)
ssim = compare_ssim(ref_array, img_array, multichannel=True, data_range=255)
The result is 0.0583.
Which of the two results is correct, and why? The documentation does not address this, probably because it is a conceptual question.
Upvotes: 3
Views: 2297
Reputation: 1163
The second one is correct, assuming the image really is 256 by 256 pixels and not a very long, thin strip.
SSIM is computed over local windows, so it takes neighbouring pixels into account (for luminance and contrast masking and for identifying structures). Images can be any shape, but if you reshape to (256*256, 3) and pass multichannel=True, you are telling the algorithm that the image is 65536 pixels long and 1 pixel wide, with 3 channels. The window then slides along that single axis only, so vertical structures are not taken into account.
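To make the neighbourhood argument concrete, here is a minimal NumPy-only sketch. It does not compute SSIM and is not scikit-image's implementation. It only shows which pixels end up next to each other after flattening. The image size (4 by 5), the 3-wide window and all variable names are assumptions chosen for illustration (scikit-image's default window is 7 wide).

```python
import numpy as np

H, W = 4, 5  # small hypothetical image size, for illustration only

# Label every pixel with its own (row, col) coordinate.
# The last axis plays the role of the channel axis.
rows, cols = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
coords = np.stack([rows, cols], axis=-1)   # shape (H, W, 2)
flat = coords.reshape(H * W, 2)            # same kind of reshape as (256*256, 3)

r, c = 2, 2
i = r * W + c                              # index of pixel (r, c) after flattening

# A 1-D window of width 3 around index i only contains pixels from the same row.
window_1d = flat[i - 1:i + 2].tolist()

# A 2-D 3x3 window also contains the rows above and below.
window_2d = coords[r - 1:r + 2, c - 1:c + 2].reshape(-1, 2).tolist()

# The pixel directly below (r, c) sits W positions away in the flattened array.
vertical_gap = ((r + 1) * W + c) - i

print(window_1d)
print(window_2d)
print(vertical_gap)
```

In the flattened layout, the pixel directly below a given pixel is a full row width away (256 positions for the images in the question), so it never falls inside a small sliding window. That is why the two calls give different scores.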
Upvotes: 3