Daniel

Reputation: 23

Python: Detect word from string and also find its location

I am new to Python and want to make a simple program that prints back your name, including any preposition, in James Bond style.

So if the name contains any prepositions, such as 'Van', 'Von', 'De' or 'Di', I want the program to print it as:

{Preposition} {LastName}, {FirstName} {Preposition} {LastName} *edited

For this, I understand we need a list of the words in the user's name and a list of the prepositions.

a = [user input separated with the .split function]
b = [list of prepositions]

To find an instance of a preposition in the name, I found that the code below could be used:

if any(x in a for x in b):

However, I ran into a problem when trying to print the name, since the preposition could be any of those in list b. I cannot find a way to print it without knowing which one it is and where it sits in the string. At first I thought the .index function could be used, but it seems to be able to search for only one word, not several as needed here. The closest I can get is:

name_split.index('preposition1') # works
name_split.index('preposition1', 'preposition2', etc.) # does not work

So what I'm asking is whether there is a way to check if any of the words from a list (b) are used in an inputted text, and also get the location of said word.

I hope I was able to explain it properly, and that someone can lend me some assistance. Thank you in advance.

Upvotes: 2

Views: 171

Answers (3)

0dminnimda

Reputation: 1463

I can't think of a better way of doing this than using a for loop:

pattern = "{1} {2}, {0} {1} {2}"
prepositions = ['van', 'von', 'de', 'di']

# (optional) 'lower' so that we don't have to consider cases like 'vAn'
name = "Vincent van Gogh".lower()
index = -1  # by default, we believe that we did not find anything
for preposition in prepositions:
    # 'find' is the same as 'index', but returns -1 if the substring is not found
    index = name.find(preposition)
    if index != -1:
        break  # found an entry

if index == -1:
    print("Not found")
else:
    print("The index is", index,
          "and the preposition is", preposition)
    print(pattern.format(*name.split()))

Outputs:

The index is 8 and the preposition is van
van gogh, vincent van gogh

If you want to process a list of names, you can wrap the same logic in a loop:

pattern = ...
prepositions = ...
names = ...

for name in names:
    name = name.lower()
    ... # the rest is the same

New version that also handles a second type of "preposition" (the suffixes "Jr." and "Sr."):

def check_prepositions(name, prepositions):
    index = -1

    for preposition in prepositions:
        index = name.find(preposition)
        if index != -1:
            break  # found an entry

    return index, preposition


patterns = [
    "{1} {2}, {0} {1} {2}",
    "{1}, {0} {1} {2}"
]

all_prepositions = [
    ['van', 'von', 'de', 'di'],
    ["Jr.", "Sr."]
]

names = ["Vincent van Gogh", "Robert Downey Jr.", "Steve"]

for name in names:
    for pattern, prepositions in zip(patterns, all_prepositions):
        index, preposition = check_prepositions(name, prepositions)

        if index != -1:
            print("The index is", index,
                  "and the preposition is", preposition)
            print(pattern.format(*name.split()))
            break

    if index == -1:
        print("Not found, name:", name)

Outputs:

The index is 8 and the preposition is van
van Gogh, Vincent van Gogh
The index is 14 and the preposition is Jr.
Downey, Robert Downey Jr.
Not found, name: Steve
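
One caveat about the versions above: str.find looks for substrings, so 'de' would also be found inside a name such as "Anderson". The sketch below is one possible way to match whole words instead, and to report the position of the matching word in the split name, which is closer to what the question describes. The names find_preposition, bond_style and PREPOSITIONS are hypothetical helpers for illustration, not part of the code above.

```python
PREPOSITIONS = ["van", "von", "de", "di"]  # assumed list, taken from the question


def find_preposition(name, prepositions):
    """Return (position, word) for the first word of the name that is a
    known preposition, or (-1, None) if there is none."""
    known = {p.lower() for p in prepositions}
    for position, word in enumerate(name.split()):
        # Compare whole words, case-insensitively, instead of substrings.
        if word.lower() in known:
            return position, word
    return -1, None


def bond_style(name, prepositions):
    """Format a name as '{surname}, {full name}', where the surname starts
    at the preposition if one is found, else at the last word."""
    words = name.split()
    if not words:
        return name
    position, _ = find_preposition(name, prepositions)
    if position == -1:
        position = len(words) - 1  # no preposition: the last word is the surname
    surname = " ".join(words[position:])
    return f"{surname}, {' '.join(words)}"


print(find_preposition("Vincent van Gogh", PREPOSITIONS))
print(bond_style("Vincent van Gogh", PREPOSITIONS))
print(bond_style("Anderson Cooper", PREPOSITIONS))
```

Because the comparison is done per word, the original capitalisation of the name is kept in the output, and no separate lower() call on the whole name is needed.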

Upvotes: 1

Mark Moretto

Reputation: 2348

A different approach, using regular expressions (I know).

import re

def process_input(string: str) -> str:
    string = string.strip()
    # Preset some values.
    ln, fn, prep = "", "", ""

    # if the string is blank, return it
    # Otherwise, continue.
    if len(string) > 0:

        # Search for possible delimiter.
        res = re.search(r"([^a-z0-9-'\. ]+)", string, flags = re.I)

        # If delimiter found...
        if res:
            delim = res.group(0)

            # Split names by delimiter and strip whitespace.
            # re.escape ensures the delimiter is matched literally.
            ln, fn, *err = [s.strip() for s in re.split(re.escape(delim), string)]
     
        else:
            # Split on whitespace
            names = [s.strip() for s in re.split(r"\s+", string)]

            # If first, preposition, last exist or first and last exist.
            # update variables.
            # Otherwise, raise ValueError.
            if len(names) == 3:
                fn, prep, ln = names
            elif len(names) == 2:
                fn, ln = names
            else:
                raise ValueError("First and last name required.")

        # Check for whitespace in last name variable.
        ws_res = re.search(r"\s+", ln)
        if ws_res:
            # Split last name if found.
            prep, ln, *err = re.split(r"\s+", ln)
        
        # Create array of known names.
        output = [f"{ln},", fn, ln]

        # Insert prep if it contains a value
        # This is simply a formatting thing.
        if len(prep) > 0:
            output.insert(2, prep)

        # Output formatted string.
        return " ".join(output)

    return string


if __name__ == "__main__":
    # Loop until q is entered or the maximum number of runs is reached.
    re_run = True
    max_runs = 10

    while re_run and max_runs > 0:
        print("Please enter your full name\nor press [q] to exit:")
        user_input = input()
        if user_input:
            if user_input.lower().strip() == "q":
                re_run = False
                break

            result = process_input(user_input)
            print("\n" + result + "\n\n")
            max_runs -= 1
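
One detail in the delimiter branch is worth illustrating. re.split treats its first argument as a regular expression, so a detected delimiter such as | or + has to be escaped if it is meant to match literally. Below is a minimal sketch of that step on its own; split_on_delimiter is a hypothetical helper, not part of the function above.

```python
import re


def split_on_delimiter(text, delim):
    """Split text on a literal delimiter and strip whitespace from each part."""
    # re.escape turns regex metacharacters in delim into literal characters.
    return [part.strip() for part in re.split(re.escape(delim), text)]


print(split_on_delimiter("Bond, James", ","))
print(split_on_delimiter("Bond | James", "|"))
```

For a fixed, single-character delimiter, str.split would do the same job without any escaping.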

Upvotes: 1

pho

Reputation: 25490

Why does it matter which preposition you find in the name? You aren't printing it on its own anywhere; all you really care about is the last name and the rest of the name. Instead of looking for a preposition, you could simply split from the right using rsplit() and ask for a maxsplit of 1. For example:

>>> "Vincent van Gogh".rsplit(" ", 1)
['Vincent van', 'Gogh']

>>> "James Bond".rsplit(" ", 1)
['James', 'Bond']

Then, you could simply print those values as you see fit.

fname, lname = input_name.rsplit(" ", 1)
print(f"{lname}, {fname} {lname}")

With input_name = "Vincent van Gogh", this prints Gogh, Vincent van Gogh. With input_name = "James Bond", you get Bond, James Bond.

This has the added advantage that it also works if people input a middle name / initial.

>>> fname, lname = "Samuel L. Jackson".rsplit(" ", 1)
>>> print(f"{lname}, {fname} {lname}")
Jackson, Samuel L. Jackson
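
One case the two-variable unpacking does not cover is a name with no space in it: rsplit then returns a single-element list and the assignment raises ValueError. A minimal sketch of one way to handle that, using str.rpartition, which always returns three parts; bond_intro is a hypothetical helper name, and returning a one-word name unchanged is an assumption about the desired behaviour.

```python
def bond_intro(input_name):
    """Return 'Last, First Last', or the name unchanged if it is one word."""
    # rpartition always returns (before, separator, after); if there is no
    # space, 'before' and 'separator' are empty and 'after' is the whole name.
    rest, _, last = input_name.strip().rpartition(" ")
    if not rest:
        return last
    return f"{last}, {rest} {last}"


print(bond_intro("James Bond"))
print(bond_intro("Samuel L. Jackson"))
print(bond_intro("Cher"))
```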

Note that there are lots of oddities in how people write names, so it's worth taking a look at "Falsehoods Programmers Believe About Names".

Upvotes: 1
