R.T. Canterbury

Reputation: 52

Python: match a string against list items by index after splitting, and return the item with the most matches

I am passing a string argument into a function that I want to match as closely as possible to an item in a list. The layer of complexity added here is that the argument and items in the list will all eventually be split by "." and I need to see if the strings/values match by index.

I am not strong with recursion so I am attempting this with enumerate().

It's easier to show what I want before showing what I did:

hero_episode_list = ["Batman.1.1.2.3.5.6", "Batman.1.2.1.1", "Batman.1.3.1.4", 
                     "Batman.1.1.2.3.4", "Batman.1.2.2.1.3", "Superman.1.2.1.3.4", "Superman.1.3.2.1", "Superman.1.1.2.4"]


def get_hero_match(hero):
  if hero in hero_episode_list:  # If the argument matches an item in the list EXACTLY, return the item.
    return hero
  else:
    hero_split = hero.split(".")
    for ep in hero_episode_list:
      ep_split = ep.split(".")
      print(f"Function arg: {hero_split}")
      print(f"List    item: {ep_split}")
      print("------")

get_hero_match("Batman.1.2.1.1.3")

Output:

Function arg: ['Batman', '1', '2', '1', '1', '3']
List    item: ['Batman', '1', '1', '2', '3', '5', '6']
------
Function arg: ['Batman', '1', '2', '1', '1', '3']
List    item: ['Batman', '1', '2', '1', '1']  <- should return this since it has the most matches
------
Function arg: ['Batman', '1', '2', '1', '1', '3']
List    item: ['Batman', '1', '3', '1', '4']
------
Function arg: ['Batman', '1', '2', '1', '1', '3']
List    item: ['Batman', '1', '1', '2', '3', '4']
------
Function arg: ['Batman', '1', '2', '1', '1', '3']
List    item: ['Batman', '1', '2', '2', '1', '3']
------
Function arg: ['Batman', '1', '2', '1', '1', '3']
List    item: ['Superman', '1', '2', '1', '3', '4']
------
Function arg: ['Batman', '1', '2', '1', '1', '3']
List    item: ['Superman', '1', '3', '2', '1']
------
Function arg: ['Batman', '1', '2', '1', '1', '3']
List    item: ['Superman', '1', '1', '2', '4']
------

Here's what I am trying:

hero_episode_list = ["Batman.1.1.2.3.5.6", "Batman.1.2.1.1", "Batman.1.3.1.4", 
                     "Batman.1.1.2.3.4", "Batman.1.2.2.1.3", "Superman.1.2.1.3.4", "Superman.1.3.2.1",
                     "Superman.1.1.2.4"]

def get_hero_match(hero):
  if hero in hero_episode_list:  # If the argument matches an item in the list EXACTLY, return the item.
    return hero
  else:
    hero_split = hero.split(".")
    ep_split = [ep.split(".") for ep in hero_episode_list]
    for item in ep_split:
      for count, (h, e) in enumerate(zip(hero_split, item)):
        if h == e:
          print(count, h, e)

get_hero_match("Batman.1.2.1.1.3")

Output:

0 Batman Batman
1 1 1
0 Batman Batman  <-- should return this one
1 1 1
2 2 2
3 1 1
4 1 1
0 Batman Batman
1 1 1
3 1 1
0 Batman Batman
1 1 1
0 Batman Batman <- don't know what this one's doing
1 1 1
2 2 2
4 1 1
5 3 3
1 1 1
2 2 2
3 1 1
1 1 1
4 1 1
1 1 1

How can I get the highest matched 'count' value using enumeration? I want to use that to then return the value in the list since it has the most matches by index.

Upvotes: 1

Views: 330

Answers (1)

TLeo

Reputation: 115

I wrote it so that an item can only match if the hero name is the same, e.g. Superman. If there are no matches, the result is the first element of the list.

def get_hero_match(hero):
    # An exact match needs no further comparison.
    if hero in hero_episode_list:
        return hero

    hero_split = hero.split(".")
    # Default to the first element of the list in case nothing matches.
    max_matches = [0, hero_episode_list[0]]
    for ep in hero_episode_list:
        ep_split = ep.split(".")

        # Skip episodes that belong to a different hero.
        if ep_split[0] != hero_split[0]:
            continue

        number_of_matches = 0
        # Walk both split versions in parallel, index by index.
        for ep_part, hero_part in zip(ep_split, hero_split):
            if ep_part != hero_part:
                break  # Stop counting at the first mismatch.
            number_of_matches += 1

        # Keep this episode if it beats the best count so far.
        if number_of_matches > max_matches[0]:
            max_matches = [number_of_matches, ep]

    return max_matches[1]  # Return the best match.
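
For comparison, here is a more compact sketch of the same idea that scores every list item and picks the highest score with max(). The names HERO_EPISODES, count_index_matches, count_leading_matches and best_match are made up for this example and are not from the question. It assumes the list is non-empty (max() raises ValueError on an empty list) and it does not filter by hero name, so add that check if you need it.

```python
from itertools import takewhile

# Same data as in the question (the constant name is hypothetical).
HERO_EPISODES = [
    "Batman.1.1.2.3.5.6", "Batman.1.2.1.1", "Batman.1.3.1.4",
    "Batman.1.1.2.3.4", "Batman.1.2.2.1.3", "Superman.1.2.1.3.4",
    "Superman.1.3.2.1", "Superman.1.1.2.4",
]


def count_index_matches(a, b):
    """Count every index where the dot-separated parts are equal."""
    return sum(x == y for x, y in zip(a.split("."), b.split(".")))


def count_leading_matches(a, b):
    """Count equal parts from the left, stopping at the first mismatch."""
    pairs = zip(a.split("."), b.split("."))
    return sum(1 for _ in takewhile(lambda p: p[0] == p[1], pairs))


def best_match(hero, episodes, score=count_index_matches):
    """Return the episode with the highest score for `hero`.

    On a tie, max() keeps the first of the tied items in list order.
    """
    return max(episodes, key=lambda ep: score(hero, ep))


print(best_match("Batman.1.2.1.1.3", HERO_EPISODES))
print(best_match("Batman.1.2.1.1.3", HERO_EPISODES, score=count_leading_matches))
```

Both calls print Batman.1.2.1.1 for this input. Note that when counting every index, "Batman.1.2.2.1.3" also scores 5, so the result depends on list order. Counting only leading matches avoids that tie here (5 versus 3).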

Upvotes: 1
