Reputation: 136
I am trying to match names by using the first, second, and last names, either in the correct order or not, using all of them or not. So far I've got this code and it sort of works, but I think it's not the right way of doing it. Do you know another way of doing this?
The names in the data set look like this:
name = 'DAVID SCOTT MUSTAIN'
What I want is to match that name if I search for 'DAVID', 'MUSTAIN SCOTT', 'SCOTT DAVID', etc.
The function I got so far looks like this:
def search_name(somename):
    for full_name in some_dataset:
        if set(somename.upper().split()).issubset(full_name.split()):
            print('match:', full_name)
If I input something like 'DAV' or 'SCOT', this will not match anything. How should I proceed in order to make a match even with incomplete names? If I split the names into single letters, it will match every name containing those letters, regardless of the order of the letters.
Upvotes: 1
Views: 369
Reputation: 15376
You can use any to check whether any word in somename is a substring of any of the names in full_name:
def search_name(somename):
    for full_name in some_dataset:
        if any(n.upper() in fn for n in somename.split() for fn in full_name.split()):
            print('match:', full_name)
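As a self-contained sketch of the same check: the sample some_dataset below is hypothetical (the question does not show its data), and the dataset and require_all parameters are additions for illustration only. With require_all=True every search word has to match some part of the name, which is closer to the issubset behaviour in the question while still accepting partial names.

```python
# Hypothetical sample data; the original post does not show some_dataset.
some_dataset = ['DAVID SCOTT MUSTAIN', 'JAMES ALAN HETFIELD', 'LARS ULRICH']

def search_name(somename, dataset, require_all=False):
    """Return the full names matched by the (possibly partial) search words."""
    queries = somename.upper().split()
    # any: one matching word is enough; all: every search word must match.
    combine = all if require_all else any
    return [
        full_name
        for full_name in dataset
        # "queries and" keeps an empty search from matching everything.
        if queries and combine(
            any(q in part for part in full_name.split()) for q in queries
        )
    ]

print(search_name('dav', some_dataset))
print(search_name('SCOT LAR', some_dataset))
print(search_name('SCOT LAR', some_dataset, require_all=True))
```

Note that `in` tests for a substring anywhere in the word, so 'AVI' also matches 'DAVID'. If only the beginning of a name should count, `part.startswith(q)` could be used instead.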
And here is an example using sum and a dictionary to pick the name with the most matches:
def search_name(somename):
    matches = {}
    for full_name in some_dataset:
        # count how many (search word, name part) pairs match
        matches[full_name] = sum(1 for n in somename.split() for fn in full_name.split() if n.upper() in fn)
    # compute the maximum once; default=0 covers an empty dataset
    best = max(matches.values(), default=0)
    best_matches = [k for k, v in matches.items() if v == best and v != 0]
    for match in best_matches:
        print('match:', match)
I'm sure there are better ways to write this function, but I'm very sleep deprived.
As for your second question, perhaps you could print or return all the items in the best_matches list?
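A minimal sketch of the "return all the best matches" idea, assuming a hypothetical sample dataset and a function name (best_matches) and dataset parameter that are not in the original post:

```python
# Hypothetical sample data for illustration only.
some_dataset = ['DAVID SCOTT MUSTAIN', 'DAVID ELLEFSON', 'SCOTT TRAVIS']

def best_matches(somename, dataset):
    """Return every full name that shares the highest non-zero match count."""
    queries = somename.upper().split()
    scores = {
        full_name: sum(1 for q in queries for part in full_name.split() if q in part)
        for full_name in dataset
    }
    top = max(scores.values(), default=0)  # default avoids an error on empty data
    return [name for name, score in scores.items() if score == top and score > 0]

print(best_matches('dav scot', some_dataset))
print(best_matches('dav', some_dataset))
```

Returning a list instead of printing lets the caller decide what to do with ties, for example showing all of them to the user.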
Upvotes: 2
Reputation: 1140
I made a little function that uses a few more statements:
def search_name(name, toSearch, num = 2):
    found = []
    for word in name.split():
        search = word[:num]
        for letter in word[num:]:
            search += letter
            isThere = [data for data in toSearch.split() if data in search]
            if isThere:
                found += isThere
                break
    return len(toSearch.split()) == len(found)

name = 'DAVID SCOTT MUSTAIN'
if search_name(name,'TA'):
    print(name)
else:
    print('Nothing')
Is this what you want?
Upvotes: 1
Reputation: 4379
I might use
    if full_name in somename and not set(full_name.split()) - set(somename.split())
to see if it's a substring and contains no extra short names.
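To illustrate just the set-difference part of that check: the difference is empty (and therefore falsy) when every word of the first string also appears in the second. The helper name extra_words and its parameters are hypothetical, used only for this sketch.

```python
def extra_words(candidate, reference):
    """Return the words of candidate that do not appear in reference."""
    return set(candidate.split()) - set(reference.split())

# No extra words: the difference is the empty set, so "not extra_words(...)" is True.
print(extra_words('SCOTT DAVID', 'DAVID SCOTT MUSTAIN'))

# 'ELLEFSON' is not part of the reference name, so it is reported as extra.
print(extra_words('SCOTT ELLEFSON', 'DAVID SCOTT MUSTAIN'))
```

Because sets ignore order, this works for names given in any order, but it only compares whole words and will not accept partial names such as 'DAV'.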
Upvotes: 0