Reputation: 3257
I have a handwritten digit in a box and I'm trying to pull out just the digit. The image is 208 x 117, so that's about 24k pixels.
I want to take advantage of the fact that I have color, so I decided to use a clustering algorithm to isolate the color of the digit, then extract just those pixels. The problem is that I need to get this down to 0.01s per digit, and right now sklearn.cluster.KMeans takes about 0.15s. I tried resizing the image, but that takes time in itself. I also tried using a threshold to keep only the colored pixels and ignore the light background (that gets me down to 10k pixels), but that didn't speed things up much.
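For reference, the thresholding step could look roughly like the sketch below. It is only an illustration: the function name `colored_pixel_mask` and the cutoff of 200 are placeholders I made up for this post, and it assumes the image is an H x W x 3 uint8 array with a near-white background.

```python
import numpy as np

def colored_pixel_mask(image, background_min=200):
    """Boolean mask of pixels that are NOT light background.

    A pixel is treated as background when every channel is at or above
    background_min (an assumed cutoff, tune it for the real images).
    """
    return ~(image >= background_min).all(axis=-1)

# Tiny synthetic example: white image with two colored pixels.
image = np.full((4, 5, 3), 255, dtype=np.uint8)
image[1, 2] = (30, 60, 200)
image[3, 0] = (250, 40, 40)

mask = colored_pixel_mask(image)
colored_pixels = image[mask]  # shape (n_colored, 3), ready for clustering
```

Only `colored_pixels` would then be passed to the clustering step.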
Any ideas?
Upvotes: 0
Views: 478
Reputation: 3257
I found a way. It turns out you get a massive speedup by reducing the sample size. So I just randomly sampled a quarter of the pixels and fed only that sample into the clustering function. That gave me a 50x speedup.
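A minimal sketch of the idea, not my exact code: the function name `cluster_pixels_subsampled`, the choice of 2 clusters, `n_init=3`, and the fixed seed are all assumptions for illustration. Labeling the remaining pixels with `predict` afterwards is also an assumption about how you would get the full mask back, and the timing will depend on your machine.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_pixels_subsampled(image, n_clusters=2, fraction=0.25, seed=0):
    """Fit KMeans on a random subset of pixels, then label every pixel."""
    # Flatten H x W x C into a (num_pixels, C) array of color vectors.
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float32)

    # Randomly pick a fraction of the pixels (without replacement).
    rng = np.random.default_rng(seed)
    n_samples = max(n_clusters, int(len(pixels) * fraction))
    idx = rng.choice(len(pixels), size=n_samples, replace=False)

    # Fit only on the sample; this is where the time is saved.
    km = KMeans(n_clusters=n_clusters, n_init=3, random_state=seed)
    km.fit(pixels[idx])

    # Assign every pixel to its nearest learned center.
    labels = km.predict(pixels).reshape(image.shape[:2])
    return labels, km.cluster_centers_

# Synthetic 208 x 117 image: light noisy background with a blue block as the "digit".
rng = np.random.default_rng(1)
image = rng.integers(235, 256, size=(117, 208, 3)).astype(np.uint8)
image[40:80, 90:110] = (30, 60, 200)

labels, centers = cluster_pixels_subsampled(image)
digit_label = labels[60, 100]         # label of a pixel known to be on the digit
digit_mask = labels == digit_label    # boolean mask of the digit pixels
```

Which cluster index ends up being the digit is arbitrary, so it has to be identified after the fit, for example by picking the center farthest from the background color.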
Upvotes: 2