Reputation: 364
I would like to do a special kind of re.sub substitution: replace each quoted word with its index in a list.
string = "\"hope\" and \"love\" or \"passion\" and (\"luck\" or \"money\") "
word_list = ['hope', 'love', 'passion', 'money', 'luck']
The hoped-for output is
'0 and 1 or 2 and (4 or 3) '
I tried with
print(re.sub("\"([^\"]*)\"", stri.index(r'\g<1>') , string))
but it doesn't work.
Upvotes: 0
Views: 514
Reputation: 92904
Use the re.sub function with a replacement function as the second argument:
import re

string = "\"hope\" and \"love\" or \"passion\" and (\"luck\" or \"money\") "
word_list = ['hope', 'love', 'passion', 'money', 'luck']
# Replace each quoted word with its index; words not in word_list only lose their quotes
print(re.sub("\"([^\"]*)\"",
             lambda m: str(word_list.index(m.group(1))) if m.group(1) in word_list else m.group(1),
             string))
The output:
0 and 1 or 2 and (4 or 3)
(Keep in mind that there could be matches which are not in word_list, e.g. ... (\"luck\" or \"money\") or \"compassion\".)
re.sub(pattern, repl, string, count=0, flags=0)
... If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.
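A minimal sketch of that callback form, written as a named function instead of a lambda. The helper name replace_quoted is hypothetical, and leaving unknown words fully intact (quotes included) is an assumption about the desired behavior:

```python
import re

word_list = ['hope', 'love', 'passion', 'money', 'luck']

def replace_quoted(match):
    # Hypothetical helper: re.sub calls it once per match object.
    word = match.group(1)
    if word in word_list:
        return str(word_list.index(word))
    # Assumption: words missing from word_list are returned unchanged, quotes included.
    return match.group(0)

text = '"hope" and ("luck" or "money") or "compassion"'
result = re.sub(r'"([^"]*)"', replace_quoted, text)
print(result)  # 0 and (4 or 3) or "compassion"
```

Returning match.group(0) instead of match.group(1) keeps the quotes around words that were not replaced, which makes them easy to spot afterwards.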
Upvotes: 1
Reputation: 107347
Without considering your word list, you can use itertools.count to number the matches: pass a function as the second argument of sub() that calls next on the counter for each match.
In [10]: from itertools import count
In [11]: c = count()
In [12]: re.sub(r'"([^"]+)"', lambda x: str(next(c)), string)
Out[12]: '0 and 1 or 2 and (3 or 4) '
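One thing to watch with the counter: it keeps its state between calls, so reusing the same c for a second string continues the numbering where the first left off. A small sketch that creates a fresh counter per call (the function name number_matches is made up for illustration):

```python
import re
from itertools import count

def number_matches(text):
    # A new counter per call, so numbering restarts at 0 for every string.
    c = count()
    return re.sub(r'"([^"]+)"', lambda m: str(next(c)), text)

first = number_matches('"hope" and "love"')
second = number_matches('"luck" or "money"')
print(first)   # 0 and 1
print(second)  # 0 or 1
```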
If you want the indices to be based on the words' positions in word_list, an efficient approach is to build a dictionary with the words as keys and their indices as values, then use a simple lookup inside the replacement function passed to sub():
In [29]: word_dict = {w: str(i) for i, w in enumerate(word_list)}
In [30]: re.sub(r'"([^"]+)"', lambda x: word_dict[x.group(1)], string)
Out[30]: '0 and 1 or 2 and (4 or 3) '
Note that you could use the list.index method to get each word's index. But because list.index is O(n), it is less efficient than a dictionary lookup, which is O(1) on average.
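One caveat with the dictionary version: word_dict[x.group(1)] raises KeyError for a quoted word that is not in word_list. A sketch using dict.get with the original match as the fallback, assuming unknown words should simply be left untouched:

```python
import re

word_list = ['hope', 'love', 'passion', 'money', 'luck']
word_dict = {w: str(i) for i, w in enumerate(word_list)}

text = '"hope" and "love" or "compassion"'
# m.group(0) is the whole match, quotes included, used when the word is unknown.
result = re.sub(r'"([^"]+)"', lambda m: word_dict.get(m.group(1), m.group(0)), text)
print(result)  # 0 and 1 or "compassion"
```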
Upvotes: 0
Reputation: 48120
Alternatively (without re), you may iterate over word_list using enumerate and replace the content of the string using str.replace():
my_string = "\"hope\" and \"love\" or \"passion\" and (\"luck\" or \"money\") "
word_list = ['hope', 'love', 'passion', 'money', 'luck']
for i, word in enumerate(word_list):
    my_string = my_string.replace('"{}"'.format(word), str(i))
The final value held by my_string will be:
'0 and 1 or 2 and (4 or 3) '
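The same loop wrapped in a function, as a rough sketch (the name replace_words is made up for illustration). It scans the string once per entry in the word list, and quoted words that are not in the list are left unchanged:

```python
def replace_words(text, words):
    # Replace each quoted word with its position in `words`, one pass per word.
    for i, word in enumerate(words):
        text = text.replace('"{}"'.format(word), str(i))
    return text

word_list = ['hope', 'love', 'passion', 'money', 'luck']
result = replace_words('"hope" and ("luck" or "money") or "compassion"', word_list)
print(result)  # 0 and (4 or 3) or "compassion"
```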
Upvotes: 0