Reputation: 373
I was trying to use the point function to invert a PIL image and normalize it to 1, but I am not getting the desired results.
This is what I tried (I don't know what is wrong with it):
data = data.point(lambda p: 1 if p < 127 else 0 ) # threshold, invert and normalize to 1
For example, print(np.array(data).max()) prints True.
However, converting the PIL Image to a numpy array and then inverting it worked, as follows:
data = np.array(data.getdata(),
np.uint8).reshape(data.size[1], data.size[0], 1)
maxG = data.max() # correcting the values of folder e, they do not match the other folders
data = ( (maxG - data)/maxG ).astype('uint8')
tsfm = transforms.ToPILImage() #import torchvision.transforms as transforms
data = tsfm(data)
I have tried both methods in a word recognition experiment, and only the second one worked for me; using the point function led to incorrect results. I am not sure what the difference is.
NB: The second method is extremely slow, so if this conversion could be done using the point function, that would be a great help.
Upvotes: 1
Views: 660
Reputation: 207738
You are confusing "normalisation" and "thresholding".
With "thresholding", you make all values above or equal to a threshold equal to some high number and all values below the threshold equal to some low number. The only possible outcome for each pixel is either the high number or the low number - nothing in between. On a typical 8-bit image, you would threshold at 127, and all pixels would end up as either 0 or 255.
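As a minimal sketch of thresholding with point, assuming an 8-bit greyscale (mode "L") image: the 2x2 test image and the variable names below are made up for illustration, and the second call repeats the lambda from the question to show that it yields only 0s and 1s.

```python
import numpy as np
from PIL import Image

# Hypothetical 2x2 greyscale test image; a 2-D uint8 array becomes mode "L".
img = Image.fromarray(np.array([[0, 100], [127, 255]], dtype=np.uint8))

# Plain threshold at 127: values >= 127 become 255, everything else 0.
thresholded = np.array(img.point(lambda p: 255 if p >= 127 else 0))

# Inverted threshold mapped to 0/1, as in the question's lambda.
inverted = np.array(img.point(lambda p: 1 if p < 127 else 0))

print(thresholded.tolist())  # [[0, 0], [255, 255]]
print(inverted.tolist())     # [[1, 1], [0, 0]]
```

Either way, every pixel ends up as one of exactly two values; no intermediate grey levels survive.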
With "normalisation", you rescale all values in the image so that they lie between some new lower limit and some new upper limit, and a pixel can take either limit or any value in between. So the outcome is a set of new pixel values no lower than the low limit and no higher than the high limit, spread proportionally between the two. On a typical image, you might normalise all values to the range 0-255 and each pixel could end up with any value in that range.
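For contrast, here is a rough sketch of normalisation done with numpy, assuming a simple min-max rescale and an image that is not a single flat colour (otherwise the divisor would be zero). The pixel values and variable names are invented for the example.

```python
import numpy as np

# Hypothetical 2x2 greyscale pixel values.
pixels = np.array([[50, 100], [150, 200]], dtype=np.uint8)

lo, hi = int(pixels.min()), int(pixels.max())

# Min-max rescale to the range 0..1, kept as floats so that the
# in-between values are preserved.
unit = (pixels.astype(np.float64) - lo) / (hi - lo)

# Stretch to the full 8-bit range 0..255 and round to integers.
stretched = (unit * 255).round().astype(np.uint8)

print(stretched.tolist())  # [[0, 85], [170, 255]]
```

Note that casting the 0..1 float result straight to uint8 would truncate every value below 1.0 to 0, which turns the normalisation back into a two-level image.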
Upvotes: 1