Reputation: 2617
I'm looking for a native Python solution that lets me replace phrases wherever they appear within a list of strings. The list looks like this:
text_array = ['the store has a piano','dulcimer players are popular with the ladies','guitar','rock legends dont shy away from this gibson model or this PRS electric','guitar','fender guitar','PRS electric',...]
I'm aiming to locate phrases (exactly) in text_array and replace them with the strings I have mapped out in a dict that I'm calling thesaurus:
thesaurus = {'gibson model':'guitar', 'fender guitar':'guitar', 'PRS electric':'guitar'}
How would I iterate over each element of text_array and replace all occurrences, wherever they appear, of the phrases listed as keys in thesaurus? (Note: I just want to replace exact matches and leave the rest of the string intact.)
Desired output:
text_array = ['the store has a piano','dulcimer players are popular with the ladies','guitar','rock legends dont shy away from this guitar or this guitar', 'guitar','guitar','guitar']
Upvotes: 0
Views: 120
Reputation: 780
Here's mine:
text_array = ['the store has a piano','dulcimer players are popular with the ladies','guitar','rock legends dont shy away from this gibson model or this PRS electric','guitar','fender guitar','PRS electric',]
thesaurus = {'gibson model':'guitar', 'fender guitar':'guitar', 'PRS electric':'guitar'}
for i in range(len(text_array)):
    for x, y in thesaurus.items():
        text_array[i] = text_array[i].replace(x, y)
print(text_array)
Output:
['the store has a piano', 'dulcimer players are popular with the ladies', 'guitar', 'rock legends dont shy away from this guitar or this guitar', 'guitar', 'guitar', 'guitar']
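One thing to keep in mind with chained `str.replace` calls: each pass works on the result of the previous one, so if a replacement value is also a key, it gets rewritten again. That doesn't happen with the thesaurus in the question, but here is a rough sketch with a made-up mapping (the `mapping` dict and the single-pass regex alternative are assumptions for illustration only):

```python
import re

# Hypothetical mapping where one replacement value ('guitar') is also a key.
mapping = {'fender guitar': 'guitar', 'guitar': 'instrument'}
text = 'fender guitar'

# Sequential str.replace: the second pass rewrites the result of the first.
sequential = text
for old, new in mapping.items():
    sequential = sequential.replace(old, new)

# Single pass over one alternation (longest keys first):
# each match in the original text is replaced exactly once.
keys = sorted(mapping, key=len, reverse=True)
pattern = re.compile('|'.join(re.escape(k) for k in keys))
single_pass = pattern.sub(lambda m: mapping[m.group()], text)

print(sequential)   # instrument
print(single_pass)  # guitar
```

If none of your values can ever match a key, the simple loop is fine.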
Upvotes: 1
Reputation:
Presumably, each string contains at most one phrase from the thesaurus, so we could use a generator expression inside next to find the first matching key (any other keys that also appear in the same string are left untouched):
If you want to change the original list:
for i, text in enumerate(text_array):
    m = next(((k, v) for k, v in thesaurus.items() if k in text), None)
    if m:
        text_array[i] = text.replace(m[0], m[1])
If you want to create a new list:
out = []  # collects the (possibly modified) strings
for text in text_array:
    m = next(((k, v) for k, v in thesaurus.items() if k in text), None)
    if m:
        text = text.replace(m[0], m[1])
    out.append(text)
You can also use pandas:
import pandas as pd
s = pd.Series(text_array)
msk = s.str.contains('|'.join(thesaurus))
s[msk] = s[msk].replace(thesaurus, regex=True)
out = s.tolist()
Output:
['the store has a piano',
'dulcimer players are popular with the ladies',
'guitar',
'rock legends dont shy away from this guitar or this guitar',
'guitar',
'guitar',
'guitar']
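The "single match" assumption matters for the two loop versions: `next` stops at the first key it finds, so a string containing two different phrases keeps the second one. A small sketch using the question's data (the helper names `replace_first_match` and `replace_all_keys` are made up for this illustration):

```python
thesaurus = {'gibson model': 'guitar', 'fender guitar': 'guitar', 'PRS electric': 'guitar'}
text = 'rock legends dont shy away from this gibson model or this PRS electric'

def replace_first_match(text, mapping):
    # Mirrors the next(...) approach: only the first key found in the text is replaced.
    m = next(((k, v) for k, v in mapping.items() if k in text), None)
    return text.replace(m[0], m[1]) if m else text

def replace_all_keys(text, mapping):
    # Applies every key in turn, so several different phrases are all replaced.
    for k, v in mapping.items():
        text = text.replace(k, v)
    return text

print(replace_first_match(text, thesaurus))
# rock legends dont shy away from this guitar or this PRS electric
print(replace_all_keys(text, thesaurus))
# rock legends dont shy away from this guitar or this guitar
```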
Upvotes: 1
Reputation: 96
Using regular expressions:
import re
text_array = [
    'the store has a piano',
    'dulcimer players are popular with the ladies',
    'guitar',
    'rock legends dont shy away from this gibson model or this PRS electric',
    'guitar',
    'fender guitar',
    'PRS electric'
]
thesaurus = {
    'gibson model': 'guitar',
    'fender guitar': 'guitar',
    'PRS electric': 'guitar'
}
pattern = re.compile(r'(?<!\w)(' + '|'.join(re.escape(key) for key in thesaurus.keys()) + r')(?!\w)')
for i, sentence in enumerate(text_array):
    text_array[i] = pattern.sub(lambda x: thesaurus[x.group()], sentence)
print(text_array)
Output:
['the store has a piano', 'dulcimer players are popular with the ladies', 'guitar', 'rock legends dont shy away from this guitar or this guitar', 'guitar', 'guitar', 'guitar']
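To see what the lookarounds buy you compared with plain `str.replace`, here is a small sketch. The sample string and the `build_pattern` helper are made up for illustration, and sorting the keys longest-first is an extra precaution that is not part of the code above (it makes a longer phrase win over a shorter one that starts the same way).

```python
import re

thesaurus = {'gibson model': 'guitar', 'fender guitar': 'guitar', 'PRS electric': 'guitar'}

def build_pattern(mapping):
    # Longest keys first, so overlapping alternatives prefer the longer phrase.
    keys = sorted(mapping, key=len, reverse=True)
    # (?<!\w) and (?!\w) reject matches that are glued to other word characters.
    return re.compile(r'(?<!\w)(' + '|'.join(re.escape(k) for k in keys) + r')(?!\w)')

pattern = build_pattern(thesaurus)
sample = 'a PRS electrical fault, then a PRS electric'  # hypothetical input

# Plain substring replacement also hits the start of "electrical".
plain = sample.replace('PRS electric', 'guitar')
# The bounded pattern only replaces the phrase when it stands alone.
bounded = pattern.sub(lambda m: thesaurus[m.group()], sample)

print(plain)    # a guitaral fault, then a guitar
print(bounded)  # a PRS electrical fault, then a guitar
```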
Upvotes: 0
Reputation: 191
This would be my approach. This one doesn't affect the original text_array.
text_array = ['the store has a piano','dulcimer players are popular with the ladies','guitar','rock legends dont shy away from this gibson model or this PRS electric','guitar','fender guitar','PRS electric']
thesaurus = {'gibson model':'guitar', 'fender guitar':'guitar', 'PRS electric':'guitar'}
res = []
for text in text_array:
    for key in thesaurus:
        text = text.replace(key, thesaurus[key])
    res.append(text)
print(res)
Upvotes: 2
Reputation: 144
Use this code:
text_array = ['the store has a piano','dulcimer players are popular with the ladies','guitar','rock legends dont shy away from this gibson model or this PRS electric','guitar','fender guitar','PRS electric']
thesaurus = {'gibson model':'guitar', 'fender guitar':'guitar', 'PRS electric':'guitar'}
for key in thesaurus.keys():
    for i, item in enumerate(text_array):
        text_array[i] = item.replace(key, thesaurus[key])
print(text_array)
Result:
['the store has a piano', 'dulcimer players are popular with the ladies', 'guitar', 'rock legends dont shy away from this guitar or this guitar', 'guitar', 'guitar', 'guitar']
Upvotes: 1
Reputation: 1191
You can use the code snippet below to get the expected output:
text_array = ['the store has a piano','dulcimer players are popular with the ladies','guitar','rock legends dont shy away from this gibson model or this PRS electric','guitar','fender guitar','PRS electric',...]
thesaurus = {'gibson model':'guitar', 'fender guitar':'guitar', 'PRS electric':'guitar'}
for index, val in enumerate(text_array):
    # Check whether each key appears in the list item
    for key in list(thesaurus.keys()):
        if key in val:
            # Update the list item in place
            text_array[index] = text_array[index].replace(key, thesaurus[key])
Upvotes: 1