MichaelR

Reputation: 999

Find and replace words in string

I know this question has been asked many times in different versions, but I did not find anything that helped me.

I have a list of words:

arr = ["id",...]

And I have several strings:

str = "my_id"
str1 = "Id_number"
str2 = "my_id_rocks"
str3 = "my_idea"

I'm trying to find the word "id" in the strings and turn it into upper case, but if "id" is part of a longer word in the string, it should be left alone. After I apply the function, I should get:

str = "my_ID"
str1 = "ID_number"
str2 = "my_ID_rocks"
str3 = "my_idea"

I cannot assume anything about the strings, some letters can be upper case, some lower case.

So far this is what I have, but it also capitalizes idea => IDea, which I don't want:

def words_to_upper(str):
    words = ["id"]
    for word in words:
        if word in str.lower():
            replace_word = re.compile(re.escape(word), re.IGNORECASE)
            str = replace_word.sub(word.upper(), str)
            break
    return str

Thank you.

Upvotes: 0

Views: 69

Answers (2)

Jasper

Reputation: 3947

I added the [regexp] tag because you need regular expressions to do this (or at least, this is what they are made for, so you'd better use them instead of reinventing the wheel).

The keywords you need are lookahead and lookbehind; see the bottom of this section.

import re

teststrs = ["my_id", "Id_number", "my_id_rocks", "my_idea"]

replace_with_upper = "id"

def toUpper(match):
    return match.group(1).upper()

for test_me in teststrs:
    test_me = re.sub("(?<![a-z])({})(?![a-z])".format(replace_with_upper), toUpper, test_me, flags=re.IGNORECASE)
    print(test_me)

The (?<![a-z]) is a negative lookbehind: "don't match if this pattern matches on the left". So if there is a letter on the left of "id", don't match. This doesn't happen with your examples, but I think you want this behavior as well.

The (?![a-z]) is a negative lookahead: "don't match if this pattern matches on the right". This prevents the regex from matching "my_idea", because the lookahead sees the "e".
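Since the question starts from a list of words, here is a minimal sketch of how the same lookarounds might be wrapped in a function that handles several words in one pass. The function name words_to_upper is borrowed from the question; the two-argument signature, the use of re.escape, and the longest-first ordering are my own assumptions, not part of the code above.

```python
import re

def words_to_upper(text, words):
    # Escape each word so regex metacharacters are matched literally, and try
    # longer words first so they win over shorter ones in the alternation.
    alternatives = "|".join(
        re.escape(w) for w in sorted(words, key=len, reverse=True)
    )
    # Same lookarounds as above: no letter directly before or after the word.
    pattern = re.compile(
        r"(?<![a-z])(?:{})(?![a-z])".format(alternatives), re.IGNORECASE
    )
    # Upper-case whatever was matched, regardless of its original casing.
    return pattern.sub(lambda m: m.group(0).upper(), text)

for s in ["my_id", "Id_number", "my_id_rocks", "my_idea"]:
    print(words_to_upper(s, ["id"]))
```

With the four strings from the question this should leave "my_idea" untouched and upper-case the other three.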

Upvotes: 1

bobble bubble

Reputation: 18490

You can use lookarounds to check that there is no alphanumeric character directly before or after id:

(?i)(?<![a-z0-9])id(?![a-z0-9])

See demo at regex101
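For illustration, this is roughly how the pattern could be applied in Python. The helper name upper_id and the comparison with \b are my own additions. The comparison is there to show why lookarounds are used: the underscore counts as a word character, so a plain word boundary does not match inside "my_id".

```python
import re

# The pattern from above; (?i) makes it case-insensitive.
pattern = re.compile(r"(?i)(?<![a-z0-9])id(?![a-z0-9])")

# For comparison only: a word-boundary version of the same idea.
word_boundary = re.compile(r"(?i)\bid\b")

def upper_id(text):
    # Replace each standalone "id" (in any casing) with its upper-case form.
    return pattern.sub(lambda m: m.group(0).upper(), text)

print(upper_id("my_id"))                  # underscore is not alphanumeric here
print(upper_id("id2"))                    # a digit next to "id" blocks the match
print(word_boundary.sub("ID", "my_id"))   # \b treats "_" as a word character
```

Note that, unlike a letters-only check, this version also leaves strings such as "id2" alone, because digits are included in the character class.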

Upvotes: 2
