Reputation: 1
I want to build an application that predicts a handwritten digit drawn on a canvas, using a model trained on the MNIST dataset, and prints the prediction on the screen. Almost everything is ready, and the code below runs without errors, but I have one problem: the figure I draw on the canvas is not centered before prediction, so the model cannot correctly predict a digit that I draw in a corner of the canvas.
This is my code:
Loading the model:
self.loaded_model = pickle.load(open('svm_model.pkl','rb'))
The label used as the canvas container:
self.label = QtWidgets.QLabel()
canvas = QtGui.QPixmap(300, 300)
canvas.fill(QtGui.QColor("black"))
self.label.setPixmap(canvas)
self.last_x, self.last_y = None, None
The predict button's function:
def predict(self):
    self.label.setAlignment(QtCore.Qt.AlignCenter)
    s = self.label.pixmap().toImage().bits().asarray(300 * 300 * 4)
    arr = np.frombuffer(s, dtype=np.uint8).reshape((300, 300, 4))
    arr = np.array(ImageOps.grayscale(Image.fromarray(arr).resize((28,28), Image.ANTIALIAS)))
    arr = (arr/255.0).reshape(1, -1)
    self.prediction.setText('Prediction: '+str(self.loaded_model.predict(arr)))
These are visualizations of my problem:
I can post the rest of my code if needed. I think the cause of the problem is that the image is not centered before prediction, because the model guesses correctly when I draw in the middle of the canvas. I could not find a question on Stack Overflow that matches my problem. There are similar ones, but no matter how hard I tried, I could not adapt them to my code.
Upvotes: 0
Views: 440
Reputation: 26886
One way of doing this is to use SciPy's ND-Image function for finding objects in a labelled array, scipy.ndimage.find_objects().
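As a small illustration of what that function returns, here is a sketch on a made-up 4x5 array (the array and the name bbox are only for this example). Passing arr != 0 treats every nonzero pixel as one object, so the result is a tuple of slices describing the bounding box of everything that was drawn:

```python
import numpy as np
import scipy.ndimage

# Made-up example data: a small "drawing" away from the center.
arr = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 7, 7, 0],
    [0, 0, 0, 7, 0],
    [0, 0, 0, 0, 0],
])

# All nonzero pixels count as label 1, so there is a single bounding box.
bbox = scipy.ndimage.find_objects(arr != 0, max_label=1)[0]
print(bbox)
# (slice(1, 3, None), slice(2, 4, None))

# Indexing with the slices crops the array to the drawn region.
print(arr[bbox].tolist())
# [[7, 7], [0, 7]]
```

Note that if the array contains no nonzero pixels, the entry for that label is None instead of a tuple of slices.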
Essentially, you would:
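import numpy as np
import scipy as sp
import scipy.ndimage

def recenter(arr):
    # Bounding box (one slice per axis) around all nonzero pixels.
    slicing = sp.ndimage.find_objects(arr != 0, max_label=1)[0]
    # Slices of the same size, placed in the middle of each axis.
    center_slicing = tuple(
        slice((dim - sl.stop + sl.start) // 2, (dim + sl.stop - sl.start) // 2)
        for sl, dim in zip(slicing, arr.shape))
    # Copy the bounding-box content into an empty array at the centered position.
    result = np.zeros_like(arr)
    result[center_slicing] = arr[slicing]
    return result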
To test that it actually works:
arr = np.array([
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 2],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])
tgt = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 2, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])
res = recenter(arr)
np.allclose(res, tgt)
# True
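Note that when the array size and the object's bounding-box size have different parity along a dimension, the object cannot be placed exactly in the middle and will sit one pixel closer to the low-index edge along that dimension (for 2D arrays these are the top and left edges when printing). For example, a 3-pixel-wide object in an 8-pixel-wide array ends up at indices 2 to 4, leaving 2 empty pixels before it and 3 after it.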
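To connect this with the canvas from the question, here is a minimal sketch of how the recentering step could sit before the downscaling. It is not the asker's actual pipeline: canvas_to_features is a hypothetical helper, the grayscale conversion is a plain channel average rather than PIL's ImageOps.grayscale, and scipy.ndimage.zoom stands in for the PIL resize so that the example only needs NumPy and SciPy. This variant of recenter also adds a guard for a blank canvas, where find_objects yields None.

```python
import numpy as np
import scipy.ndimage


def recenter(arr):
    """Move the bounding box of all nonzero pixels to the middle of arr."""
    slicing = scipy.ndimage.find_objects(arr != 0, max_label=1)[0]
    if slicing is None:
        # Blank canvas (assumption: return it unchanged rather than fail).
        return arr.copy()
    center_slicing = tuple(
        slice((dim - sl.stop + sl.start) // 2, (dim + sl.stop - sl.start) // 2)
        for sl, dim in zip(slicing, arr.shape))
    result = np.zeros_like(arr)
    result[center_slicing] = arr[slicing]
    return result


def canvas_to_features(rgba, size=28):
    """Hypothetical helper: (H, W, 4) uint8 canvas buffer -> (1, size*size) row."""
    # Plain average of the three color channels; the alpha channel is ignored.
    gray = rgba[..., :3].mean(axis=2)
    # Center the drawing at full resolution, before shrinking it.
    centered = recenter(gray)
    # Linear-interpolation downscale; a stand-in for the PIL resize.
    factors = (size / centered.shape[0], size / centered.shape[1])
    small = scipy.ndimage.zoom(centered, factors, order=1)
    # Scale to [0, 1] and flatten, as in the question's predict().
    return (small / 255.0).reshape(1, -1)


# Usage with a made-up canvas: a white block drawn near the top-left corner.
canvas = np.zeros((300, 300, 4), dtype=np.uint8)
canvas[10:60, 20:50] = 255
features = canvas_to_features(canvas)
print(features.shape)
# (1, 784)
```

In the question's predict(), the resulting row would then be passed to self.loaded_model.predict() in place of the uncentered array. Centering before resizing keeps the shift accurate to one canvas pixel instead of one 28x28 pixel.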
Upvotes: 1