Reputation: 61
I am working on a facial comparison app that will give me the n closest faces to my target face.
I have done this with dlib/face_recognition, which uses numpy arrays. However, I am now trying to do the same thing with facenet/pytorch and am running into an issue because it uses tensors.
I have created a database of embeddings and I am giving the function one picture to compare against them. I would like it to sort the list from lowest distance to highest and give me the lowest 5 results or so.
Here is the code that does the comparison. At this point I am feeding it a photo and asking it to compare against the embedding database.
def face_match(img_path, data_path):  # img_path = location of photo, data_path = location of data.pt
    # getting embedding matrix of the given img
    img_path = (os.getcwd()+'/1.jpg')
    img = Image.open(img_path)
    face = mtcnn(img)  # returns the cropped face as a tensor
    emb = resnet(face.unsqueeze(0)).detach()  # detach() so the embedding does not track gradients
    saved_data = torch.load('data.pt')  # loading data.pt file
    embedding_list = saved_data[0]  # getting embedding data
    name_list = saved_data[1]  # getting list of names
    dist_list = []  # list of matched distances, minimum distance is used to identify the person
    for idx, emb_db in enumerate(embedding_list):
        dist = torch.dist(emb, emb_db)
        dist_list.append(dist)
    namestodistance = list(zip(name_list, dist_list))
    print(namestodistance)

face_match('1.jpg', 'data.pt')
This gives me all the names and their distances from the target photo, in alphabetical order of the names, in the form (Adam Smith, tensor(1.2123432)), (Brian Smith, tensor(0.6545464)), etc. If the 'tensor' wasn't part of every entry, I think it would be no problem to sort it. I don't quite understand why it's being added to the entries. I can cut this down to 5 results by adding [0:5] at the end of dist_list, but I can't figure out how to sort the list. I think the problem is the word tensor being in every entry.
I have tried
    for idx, emb_db in enumerate(embedding_list):
        dist = torch.dist(emb, emb_db)
        sorteddist = torch.sort(dist)
but for whatever reason this only returns one distance value, and it isn't the smallest one.
I have also tried idx_min = dist_list.index(min(dist_list)). This works fine for getting the lowest value and then matching it to a name using name_list[idx_min], which gives the best match, but I would like the best 5 matches in order as opposed to just the best match.
Is anyone able to solve this?
Upvotes: 1
Views: 836
Reputation: 959
Unfortunately I cannot test your code, but it looks to me like you are operating on a Python list of tuples. You can sort that by using a key:
namestodistance = [('Alice', .1), ('Bob', .3), ('Carrie', .2)]
names_top = sorted(namestodistance, key=lambda x: x[1])
print(names_top[:2])
Of course, you have to modify the anonymous function in key to return a sortable value instead of e.g. a torch.tensor. This can be done by using key = lambda x: x[1].item().
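Applied to tensor distances, that might look like the minimal sketch below. The names and distance values are made up for illustration; the 0-dim tensors stand in for what torch.dist returns in the question.

```python
import torch

# Hypothetical stand-in for the (name, distance) list built in the question;
# each distance is a 0-dim tensor, like the result of torch.dist.
namestodistance = [
    ("Adam Smith", torch.tensor(1.25)),
    ("Brian Smith", torch.tensor(0.5)),
    ("Carol Jones", torch.tensor(0.75)),
]

# .item() turns each 0-dim tensor into a plain Python float for the comparison.
names_top = sorted(namestodistance, key=lambda x: x[1].item())

# Keep the two closest matches and drop the tensor wrapper for printing.
best_two = [(name, dist.item()) for name, dist in names_top[:2]]
print(best_two)
```

Slicing with [:5] instead of [:2] would give the five closest matches.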
Edit: To answer the question that came up in the comments, we can refactor our code a little so that the list holds plain floats instead of tensors. Namely
namestodistance = list(map(lambda x: (x[0], x[1].item()), namestodistance))
names_top = sorted(namestodistance, key=lambda x: x[1])
print(names_top[:2])
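If only the best few matches are needed, a possible alternative to sorting the whole list is heapq.nsmallest from the standard library. This is just a sketch that assumes the distances have already been converted to plain floats as above; the names and values are made up.

```python
import heapq

# Hypothetical (name, distance) pairs with distances already converted to floats.
namestodistance = [
    ("Alice", 0.1),
    ("Bob", 0.3),
    ("Carrie", 0.2),
    ("Dave", 0.9),
]

# nsmallest returns the n entries with the smallest key, ordered from lowest to highest.
best_matches = heapq.nsmallest(2, namestodistance, key=lambda x: x[1])
print(best_matches)
```

For a database of a few hundred faces the difference from sorted(...)[:n] is negligible, so this is mostly a matter of taste.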
Upvotes: 1