Reputation: 1941
I have been fighting with a program, and although I have read a lot about working with images in Python, I cannot get it to work.
I am writing a program that recognizes a playing card. I have a "database" of all the cards, each in a separate JPG file. The idea is to compare the card I want to identify with every card in the database; the most similar one should be the card I am looking for. I have tried several different approaches, but none of them works correctly.
import math
import os

from PIL import Image


def get_card(image1path):
    """Return the name of the card in "cards" whose histogram is closest."""
    h1 = Image.open(image1path).resize((40, 55)).histogram()
    best = float('inf')
    card = None
    for root, dirs, files in os.walk("cards"):
        for fname in files:
            path = os.path.join(root, fname)
            h2 = Image.open(path).resize((40, 55)).histogram()
            # Root-mean-square difference between the two histograms.
            rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)) / float(len(h1)))
            print("RMS = " + str(rms) + " and the picture is: " + path)
            if rms < best:
                best = rms
                card = os.path.splitext(fname)[0]
    return card


image1path = "C:\\8d.jpg"  # eight of diamonds
card = get_card(image1path)
print(card)
The problem is that it does not work well: after computing the RMS between my card and every card in the database, some wrong cards get a better (lower) RMS score than the right one. So the recognized card is not the eight of diamonds, as it should be.
How should I do this? If you need me to explain it in other words, just say so.
Thank you very much.
Upvotes: 1
Views: 1024
Reputation: 19221
Looking at the images you are comparing, you actually don't want to use metrics such as RMSE. The reason is that all the images are similar in an "RMSE sense", and the same holds even for more refined metrics that ignore the basic relations present in the image. The example images you provided show this.
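As a small illustration of why a histogram-based RMS in particular can mislead, here is a minimal sketch using toy 4x4 "images" stored as plain lists (the grids, the four gray levels, and the helper names are made up for this example, not taken from the question). A histogram only counts how many pixels have each value, so two different shapes drawn with the same amount of ink are indistinguishable.

```python
import math


def histogram(pixels, levels=4):
    """Count how many pixels take each gray level (0 .. levels - 1)."""
    counts = [0] * levels
    for row in pixels:
        for value in row:
            counts[value] += 1
    return counts


def histogram_rms(h1, h2):
    """Root-mean-square difference between two equal-length histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)) / float(len(h1)))


# Two toy 4x4 "cards": same amount of ink (value 3), different shapes.
vertical_bar = [
    [0, 3, 0, 0],
    [0, 3, 0, 0],
    [0, 3, 0, 0],
    [0, 3, 0, 0],
]
horizontal_bar = [
    [0, 0, 0, 0],
    [3, 3, 3, 3],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

rms = histogram_rms(histogram(vertical_bar), histogram(horizontal_bar))
print(rms)  # 0.0: identical histograms although the shapes differ
```

Real card scans will not give exactly zero, but the same effect makes cards with similar amounts of ink score very close to each other.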
The basic relations in your case are color (which also distinguishes between spades, hearts, diamonds, and clubs) and shape measures. Detecting the color of the card reduces the search space, and all that is left is to tell apart the numbers at the top of the card. The number of connected components and the Euler number restrict the search further. What remains is distinguishing "9" from "6", "4", queen, or "A", and "3" from "J", "2", "5", or "7". "8" and "10" are already solved, the former by its Euler number and the latter by its number of connected components (all of this assumes the cards are unique; otherwise you proceed and find the most similar card). The simplest thing to do here, which will likely fail if you add more considerations to your problem, is to calculate the Hausdorff distance between each pair of remaining shapes.
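To make the two shape measures concrete, here is a minimal pure-Python sketch. It assumes the usual definition of the Euler number (connected components minus holes) and uses small hand-drawn glyphs; the glyphs and the helper names `count_components` and `describe` are hypothetical and only serve the illustration.

```python
def count_components(grid, value, diagonal):
    """Count connected regions of cells equal to `value` with a flood fill."""
    height, width = len(grid), len(grid[0])
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    if diagonal:
        steps += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    seen = set()
    count = 0
    for y in range(height):
        for x in range(width):
            if grid[y][x] != value or (y, x) in seen:
                continue
            count += 1
            stack = [(y, x)]
            seen.add((y, x))
            while stack:
                cy, cx = stack.pop()
                for dy, dx in steps:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and grid[ny][nx] == value and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return count


def describe(rows):
    """Return (connected components, Euler number) of a glyph drawn with '#'."""
    width = len(rows[0])
    # Pad with background so the outside forms exactly one background region.
    grid = [[0] * (width + 2)]
    grid += [[0] + [1 if ch == '#' else 0 for ch in row] + [0] for row in rows]
    grid += [[0] * (width + 2)]
    components = count_components(grid, 1, diagonal=False)  # 4-connected ink
    holes = count_components(grid, 0, diagonal=True) - 1    # minus the outside
    return components, components - holes


EIGHT = ["###", "#.#", "###", "#.#", "###"]          # one piece, two holes
NINE = ["###", "#.#", "###", "..#", "###"]           # one piece, one hole
SIX = ["###", "#..", "###", "#.#", "###"]            # one piece, one hole
TEN = ["#.###", "#.#.#", "#.#.#", "#.#.#", "#.###"]  # two pieces, one hole

for name, rows in [("8", EIGHT), ("9", NINE), ("6", SIX), ("10", TEN)]:
    print(name, describe(rows))
```

With these glyphs, "8" is the only shape with Euler number -1 and "10" is the only one with two components, while "9" and "6" get the same pair (1, 0), which is why a shape comparison is still needed for them.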
Here is a simplistic implementation that considers these points and works for all the given inputs. It expects two command-line arguments: an image, and a directory of images to compare it against. Each step can be improved.
import os
import sys

import numpy
from scipy.ndimage import morphology, label, find_objects
from PIL import Image

# Indices for the four suit colors.
COLORS = range(4)
RED, GREEN, BLUE, BLACK = COLORS

def card_color(img):
    """Return the (pixel count, COLORS index) pair with the highest count."""
    im = img.load()
    width, height = img.size
    black, blue, green, red = 0, 0, 0, 0
    for x in range(width):
        for y in range(height):
            r, g, b = im[x, y]
            if r > 200 and g > 200 and b > 200:
                # "white", ignore
                continue
            if r > 200 and g < 100 and b < 100:
                red += 1
            elif r < 100 and g < 100 and b > 200:
                blue += 1
            elif r < 50 and g < 50 and b < 50:
                black += 1
            elif r < 100 and g > 120 and b < 50:  # dark green
                green += 1
    return max(zip((black, blue, green, red), COLORS))

def euler_number(img, conn=4):
    """Euler number of a binary (0/255) image, counted from 2x2 pixel patterns."""
    im = img.load()
    width, height = img.size
    # c1: 2x2 blocks with one foreground pixel, c2: with three,
    # c3: with two foreground pixels on a diagonal.
    c1, c2, c3 = 0, 0, 0
    for x in range(width - 1):
        for y in range(height - 1):
            s = (im[x, y] + im[x+1, y] + im[x, y+1] + im[x+1, y+1]) // 255
            if s == 1:
                c1 += 1
            elif s == 2:
                if (im[x+1, y] and im[x, y+1]) or (im[x, y] and im[x+1, y+1]):
                    c3 += 1
            elif s == 3:
                c2 += 1
    if conn == 4:
        return (c1 - c2 + 2 * c3) // 4
    else:  # 8
        return (c1 - c2 - 2 * c3) // 4

def carefully_binarize(img, color):
    """Threshold one channel (or grayscale) and drop small components."""
    if color == BLACK:
        img = img.convert('L')
    else:
        img = img.split()[color]
    width, height = img.size
    im = numpy.empty((height + 2, width + 2), dtype=numpy.uint8)  # Padding
    im.fill(255)
    im[1:-1, 1:-1] = numpy.array(img)
    threshold = im.mean() - im.std()
    # Dark pixels (at or below the threshold) become foreground.
    mask = im <= threshold
    im[mask] = 1
    im[~mask] = 0
    # Discard small components.
    lbl, ncc = label(im)
    for i in range(1, ncc + 1):
        py, px = numpy.nonzero(lbl == i)
        if len(py) < 30:
            im[lbl == i] = 0
    return Image.fromarray(im * 255)

def discard_bottom(img, k=0.5):
    """Erase components reaching below height * k; return image and count left."""
    width, height = img.size
    im = numpy.array(img)
    limit = height * k
    lbl, ncc = label(im)
    for oslice in find_objects(lbl):
        srow, scol = oslice
        if srow.stop > limit:
            ncc -= 1
            im[srow.start:srow.stop, scol.start:scol.stop] = 0
    return Image.fromarray(im), ncc

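def signature(img):
    # Assumption: a single connected component is present now.
    im = numpy.array(img)
    im = morphology.binary_fill_holes(im)
    # Outer boundary: the pixels added by one dilation step. XOR is used
    # because NumPy does not support subtracting boolean arrays.
    im = morphology.binary_dilation(im) ^ im
    py, px = numpy.nonzero(im)
    # Materialize the points: hausdorff() iterates over them repeatedly.
    return Image.fromarray(im.astype(numpy.uint8) * 255), list(zip(py, px))
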
def hausdorff(a, b):
    """Directed Hausdorff distance from a to b, using the chessboard metric."""
    dist = 0
    for ai in a:
        mindist = float('inf')
        for bi in b:
            chess = max(abs(ai[0] - bi[0]), abs(ai[1] - bi[1]))
            if chess < mindist:
                mindist = chess
        if mindist > dist:
            dist = mindist
    return dist

img1 = Image.open(sys.argv[1]).convert('RGB')
dirpath = sys.argv[2]

# Features of the query card: color, component count, Euler number.
img1_color = card_color(img1)[1]
img1 = carefully_binarize(img1, img1_color)
img1_top, img1_top_ncc = discard_bottom(img1)
img1_top_en = euler_number(img1_top)

feature = [img1_color, img1_top_ncc, img1_top_en]

# Keep only the candidates whose three features all match.
match = []
for fname in os.listdir(dirpath):
    try:
        img2 = Image.open(os.path.join(dirpath, fname)).convert('RGB')
    except IOError:
        print("Ignoring %s" % fname)
        continue

    if card_color(img2)[1] != feature[0]:
        continue

    img2 = carefully_binarize(img2, feature[0])
    img2_top, ncc = discard_bottom(img2)
    if ncc != feature[1]:
        continue

    en = euler_number(img2_top)
    if en != feature[2]:
        continue

    match.append((img2_top, os.path.join(dirpath, fname)))

if len(match) == 1:
    print("Here is your best match: %s" % match[0][1])
else:
    # Several candidates left: pick the closest outline.
    img1_sig, sig1 = signature(img1_top)
    best_match = float('inf'), None
    for img2, fname in match:
        img2_sig, sig2 = signature(img2)
        dist = hausdorff(sig1, sig2)
        if dist < best_match[0]:
            best_match = dist, fname

    print("Best match: %s" % best_match[1])
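One detail worth keeping in mind: the `hausdorff` function above is one-sided, since it only measures how far the points of `a` are from `b`. A possible refinement, shown here as a sketch with hypothetical helper names and made-up point sets, is to take the larger of the two directions, which is the usual symmetric Hausdorff distance.

```python
def directed_hausdorff(a, b):
    """Largest chessboard distance from a point of `a` to its nearest point of `b`."""
    return max(
        min(max(abs(ay - by), abs(ax - bx)) for by, bx in b)
        for ay, ax in a
    )


def symmetric_hausdorff(a, b):
    """The larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical (row, column) point sets: `short` is a subset of `long_`.
short = [(0, 0), (0, 1)]
long_ = [(0, 0), (0, 1), (0, 5)]

print(directed_hausdorff(short, long_))   # 0: every point of `short` is in `long_`
print(directed_hausdorff(long_, short))   # 4: (0, 5) is 4 away from (0, 1)
print(symmetric_hausdorff(short, long_))  # 4
```

The one-sided version reports a perfect match whenever one outline is contained in the other, so the symmetric form may be the safer choice if such cases can occur in your card set.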
Upvotes: 3