Tim

Reputation: 39

Replace regex in string with value from dict, while retrieving the index

Given a dict and a string:

my_dict = {"X":"xxx", "Y":"yyy"}
my_str = "A[xxx]BC[yyy]"  # edited per the comment below

I need to create two different strings:

> print(result_1)
ABC
> print(result_2)
1|X|3|Y

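where result_1 is my_str with the square brackets and their contents removed, and result_2 lists, for each bracketed value, the index at which it sat in the bracket-free string, followed by the dict key whose value matches it.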

So far, I am able to find all square brackets with:

import re

vals = re.findall(r'\[([^]]*)\]', my_str)
for val in vals:
    print(val)

I know that I can find the index with str.index() or str.find() as explained here, and I also know that I can use re.sub() to replace values, but I need to combine these methods with the lookup in the dict to obtain two different strings. Can anyone help me or put me on the right path?

Upvotes: 1

Views: 100

Answers (1)

Wiktor Stribiżew

Reputation: 626758

You may use a solution like this:

import re
my_dict = {"X":"xxx", "Y":"yyy"}
my_str = "A[xxx]BC[yyy]"

def get_key_by_value(dictionary, value):
    for key, val in dictionary.items():
        if val == value:
            return key
    return value  # If no key has been found

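# Match [...] with no nested brackets; group 1 is the text inside
rx = re.compile(r'\[([^][]*)]')
result_1 = rx.sub('', my_str)  # drop every [...] in one pass
result_2_arr = []
tmp = my_str
m = rx.search(tmp)
while m:
    # m.start() is already an index into the bracket-free prefix,
    # because all earlier matches have been cut out of tmp
    result_2_arr.append("{}|{}".format(m.start(), get_key_by_value(my_dict, m.group(1))))
    tmp = tmp[:m.start()] + tmp[m.end():]  # remove the current match
    m = rx.search(tmp)

print(result_1)
print("|".join(result_2_arr))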

See the Python demo

Output:

ABC
1|X|3|Y

result_1 is the input string with every [...] substring removed.

result_2 is built as follows:

  • Search the string for a \[([^][]*)] match.
  • If there is a match, record its start index together with the dictionary key whose value equals the captured text (if no such key exists, the captured text itself is used).
  • Remove the match from the string and search the modified string again, so every later index refers to the bracket-free string.
  • When no match is left, join the recorded entries with |.
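As a variation on the steps above, here is a minimal single-pass sketch that builds both strings with re.finditer instead of searching again after each removal. It is not the code from the demo: the helper name strip_and_index and the inverted dictionary key_by_value are introduced here for illustration only, and the inversion assumes the dict values are unique.

```python
import re

my_dict = {"X": "xxx", "Y": "yyy"}
my_str = "A[xxx]BC[yyy]"

# Hypothetical helper mapping: invert the dict once so each bracketed
# value maps straight to its key. Assumes values are unique; with
# duplicate values the last key wins.
key_by_value = {value: key for key, value in my_dict.items()}

rx = re.compile(r'\[([^][]*)]')

def strip_and_index(text):
    """Return (text without [...] parts, 'index|key|index|key...' string)."""
    pieces = []    # chunks of text lying outside the brackets
    entries = []   # "index|key" items for the second result
    removed = 0    # total length of the bracketed spans seen so far
    last_end = 0   # end of the previous match in the original text
    for m in rx.finditer(text):
        pieces.append(text[last_end:m.start()])
        value = m.group(1)
        # Shift the match position left by everything already removed
        index = m.start() - removed
        entries.append("{}|{}".format(index, key_by_value.get(value, value)))
        removed += m.end() - m.start()
        last_end = m.end()
    pieces.append(text[last_end:])
    return "".join(pieces), "|".join(entries)

result_1, result_2 = strip_and_index(my_str)
print(result_1)  # ABC
print(result_2)  # 1|X|3|Y
```

Unlike the loop in the answer, this version scans the original string only once, so brackets that would only pair up after an inner [...] has been removed (nested brackets) are not matched a second time.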

Upvotes: 1
