Reputation: 433
I use cv2.imread to read a PNG file in Python. When I then use the cv2.imwrite function to immediately save the image, I find that the colours in the image have changed slightly. I am trying to perform character recognition on this image, and the OCR performs far worse on the image saved from Python than on the original image.
The first image is the original, and the second is the saved one with OpenCV.
We can see that the green has changed slightly. Whilst this does not seem important, it affects the OCR, and I therefore imagine that other changes are happening to the PNG. Does anyone know why this might be and how I can resolve it?
The code is as follows:
import cv2
img = cv2.imread('file.png')
cv2.imwrite('out.png', img)
When I run file.png through Tesseract for character recognition I get great results, but when I run out.png through Tesseract far fewer words are recognised correctly.
Upvotes: 2
Views: 4159
Reputation: 21203
When you have a .png image file, you ought to read it as a .png file.
I downloaded your image and did some analysis myself.
First, I read the image as you did:
img = cv2.imread('file.png')
img.shape
returns (446, 864, 3), i.e. an image with 3 channels.
Next, I read the same image using cv2.IMREAD_UNCHANGED:
img = cv2.imread('file.png', cv2.IMREAD_UNCHANGED)
img.shape
returns (446, 864, 4), i.e. an image with 4 channels.
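The 4-channel result can also be cross-checked without OpenCV, because a PNG records its colour type in the IHDR header chunk. The following is only a minimal sketch using the standard library; `png_color_type` and `tiny_png` are hypothetical helper names introduced here for illustration, and `tiny_png` just builds a 1x1 test image in memory so the snippet runs on its own.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Colour types defined for the IHDR chunk by the PNG specification.
COLOR_TYPES = {
    0: "grayscale",
    2: "RGB",
    3: "palette",
    4: "grayscale + alpha",
    6: "RGB + alpha",
}


def png_color_type(data: bytes) -> str:
    """Return the colour type named in the IHDR chunk of PNG bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # After the signature: 4-byte length, 4-byte chunk type, then IHDR data.
    if data[12:16] != b"IHDR":
        raise ValueError("IHDR chunk missing")
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return COLOR_TYPES.get(color_type, "unknown")


def _chunk(kind: bytes, payload: bytes) -> bytes:
    """Assemble one PNG chunk: length, type, payload, CRC."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def tiny_png(color_type: int) -> bytes:
    """Build a 1x1, 8-bit PNG in memory (demo helper, not part of any library)."""
    channels = {0: 1, 2: 3, 4: 2, 6: 4}[color_type]
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, color_type, 0, 0, 0)
    scanline = b"\x00" + bytes([255] * channels)  # filter byte + one pixel
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(scanline)) + _chunk(b"IEND", b""))


print(png_color_type(tiny_png(6)))  # RGB + alpha
print(png_color_type(tiny_png(2)))  # RGB
```

For a file on disk, the first 26 bytes are enough, e.g. `png_color_type(open('file.png', 'rb').read(26))`. A result of "RGB + alpha" would match the 4-channel shape reported by cv2.IMREAD_UNCHANGED.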
.png files can carry an additional transparency (alpha) channel, which the default read mode discards. So the next time you come across a .png file, read it using the cv2.IMREAD_UNCHANGED flag.
UPDATE:
Listing the various flags available for reading an image:
for var in dir(cv2):
    if var.startswith('IMREAD'):
        print(var)
returns:
IMREAD_ANYCOLOR
IMREAD_ANYDEPTH
IMREAD_COLOR
IMREAD_GRAYSCALE
IMREAD_LOAD_GDAL
IMREAD_UNCHANGED
Upvotes: 4