Reputation: 43
I have implemented Python code for calculating the PSNR of the Y channel in the YCrCb color space. I get a PSNR of around 35.7 dB for a pair of images:
import cv2, main
import sys
i1 = cv2.imread(sys.argv[1])
i2 = cv2.imread(sys.argv[2])
i1= cv2.cvtColor(i1, cv2.COLOR_BGR2YCrCb)
i2= cv2.cvtColor(i2, cv2.COLOR_BGR2YCrCb)
print(main.psnr(i1[:,:,0], i2[:,:,0]))
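In main, psnr is defined as:
import math
import numpy

def psnr(target, ref):
    # Work in float64 so the uint8 subtraction cannot wrap around
    target_data = numpy.array(target, dtype=numpy.float64)
    ref_data = numpy.array(ref, dtype=numpy.float64)
    diff = ref_data - target_data
    print(diff.shape)
    diff = diff.flatten('C')
    # Root-mean-square error over all pixels
    rmse = math.sqrt(numpy.mean(diff ** 2.))
    # The peak value of 255 assumes 8-bit data
    return 20 * math.log10(255 / rmse)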
I also found an online MATLAB implementation (from the paper I am referring to), which gives a PSNR of around 37.06 dB for the same pair of images:
function psnr=compute_psnr(im1,im2)
    if size(im1, 3) == 3,
        im1 = rgb2ycbcr(im1);
        im1 = im1(:, :, 1);
    end
    if size(im2, 3) == 3,
        im2 = rgb2ycbcr(im2);
        im2 = im2(:, :, 1);
    end
    imdff = double(im1) - double(im2);
    imdff = imdff(:);
    rmse = sqrt(mean(imdff.^2));
    psnr = 20*log10(255/rmse)
Could this difference be due to errors introduced by numpy, or to the numerical accuracy numpy achieves?
Upvotes: 1
Views: 5244
Reputation: 22215
Your two conversion functions seem to produce wildly different results, which explains the discrepancy.
Octave does mention there are several YCbCr standards:
The formula used for the conversion is dependent on two constants,
KB and KR which can be specified individually, or according to
existing standards:
"601" (default)
According to the ITU-R BT.601 (formerly CCIR 601) standard.
Its values of KB and KR are 0.114 and 0.299 respectively.
"709"
According to the ITU-R BT.709 standard. Its values of KB and
KR are 0.0722 and 0.2126 respectively.
Maybe the Python version is using a different standard (or maybe it's a BGR vs. RGB issue)? In any case, that's where the discrepancy lies; it doesn't seem to be a matter of numpy precision (when the two PSNR functions are tested separately with identical inputs, they produce the same results).
EDIT:
According to these:
Python (or rather, the OpenCV library) seems to be outputting the 'analog' (unscaled, full-range 0 to 255) version, whereas MATLAB / Octave is outputting the 'digital' (scaled, Y in 16 to 235) version.
This is confirmed:
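# Python
import cv2
import numpy

# Seven test colours stacked channel by channel. The array is named RGB,
# but note that COLOR_BGR2YCrCb below reads the channels in B, G, R order.
RGB = numpy.concatenate(
    (numpy.array([[[0], [255], [255], [0], [0], [0], [255]]], dtype=numpy.uint8),
     numpy.array([[[0], [0], [255], [255], [255], [0], [255]]], dtype=numpy.uint8),
     numpy.array([[[0], [0], [0], [0], [255], [255], [255]]], dtype=numpy.uint8)),
    axis=2)
RGB2Y = cv2.cvtColor(RGB, cv2.COLOR_BGR2YCrCb)
print(RGB2Y)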
[[[  0 128 128]
  [ 29 107 255]
  [179   0 171]
  [150  21  43]
  [226 149   1]
  [ 76 255  85]
  [255 128 128]]]
% Octave
pkg load image;
RGB = uint8 (cat (3, [0, 255, 255, 0, 0, 0, 255], ...
[0, 0, 255, 255, 255, 0, 255], ...
[0, 0, 0, 0, 255, 255, 255]));
RGB2Y = rgb2ycbcr(RGB)
RGB2Y =

ans(:,:,1) =

   16   81  210  145  170   41  235

ans(:,:,2) =

  128   90   16   54  166  240  128

ans(:,:,3) =

  128  240  146   34   16  110  128
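Both Y columns above can be reproduced with plain NumPy. This is only a sketch with hand-written formulas, assuming BT.601 weights; it is not the code of either library, and the helper names `luma_full_range` and `luma_studio_range` are made up for illustration. It also suggests that the OpenCV call read the test array in B, G, R order.

```python
import numpy as np

# Test colours in R, G, B order: black, red, yellow, green, cyan, blue, white.
rgb = np.array([[0, 0, 0], [255, 0, 0], [255, 255, 0], [0, 255, 0],
                [0, 255, 255], [0, 0, 255], [255, 255, 255]], dtype=np.float64)

# BT.601 luma weights for R, G, B (KR = 0.299, KB = 0.114).
weights = np.array([0.299, 0.587, 0.114])

def luma_full_range(pixels):
    """Full-range ('analog', unscaled) luma: 0..255 maps to 0..255."""
    return pixels @ weights

def luma_studio_range(pixels):
    """Studio-range ('digital', scaled) luma: 0..255 maps to 16..235."""
    return 16.0 + (219.0 / 255.0) * (pixels @ weights)

y_full = np.rint(luma_full_range(rgb)).astype(int)
y_studio = np.rint(luma_studio_range(rgb)).astype(int)
# Same pixels with the channel order reversed, i.e. read as B, G, R.
y_full_bgr = np.rint(luma_full_range(rgb[:, ::-1])).astype(int)

print(y_full)      # full-range Y, channels read as R, G, B
print(y_full_bgr)  # matches the Y column printed by the OpenCV snippet
print(y_studio)    # matches ans(:,:,1) from Octave
```

With real images loaded by cv2.imread the data is already in B, G, R order, so COLOR_BGR2YCrCb is the right flag there; the channel swap only affects this hand-built test array.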
Therefore, if it's a matter of achieving consistency, I would scale the Python results using the conversion formula from analog to digital mentioned in the Wikipedia page above.
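A minimal sketch of that scaling, assuming the usual 8-bit BT.601 mapping Y_digital = 16 + (219/255) * Y for the luma channel. The helper `to_studio_range` and the synthetic arrays are hypothetical stand-ins, not the asker's images. Because the mapping is affine, every pixel difference shrinks by 219/255, so the PSNR (still computed against a peak of 255) rises by a constant 20 * log10(255/219).

```python
import math
import numpy as np

def psnr(target, ref, peak=255.0):
    """PSNR in dB, same definition as in the question."""
    diff = np.asarray(ref, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    rmse = math.sqrt(np.mean(diff ** 2))
    return 20 * math.log10(peak / rmse)

def to_studio_range(y_full):
    """Map full-range luma (0..255) to studio range (16..235), without rounding."""
    return 16.0 + (219.0 / 255.0) * np.asarray(y_full, dtype=np.float64)

# Synthetic stand-ins for the Y channels of two images (hypothetical data).
rng = np.random.default_rng(0)
y1 = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
y2 = np.clip(y1 + rng.normal(0.0, 4.0, size=y1.shape), 0, 255)

psnr_full = psnr(y1, y2)
psnr_studio = psnr(to_studio_range(y1), to_studio_range(y2))
gain = psnr_studio - psnr_full
print(round(gain, 2))  # constant offset of 20 * log10(255 / 219)
```

The offset is about 1.32 dB whatever the images are, which is close to the 37.06 - 35.7 = 1.36 dB gap reported in the question. The small remainder may come from rounding to uint8, which this sketch skips.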
If it's a question of which version is the most appropriate one for calculating PSNR, I don't know, but from what I'm reading in the links above, my money would be on the MATLAB / Octave implementation.
Upvotes: 3