user47467

Reputation: 1093

Python possible list comprehension

I have a text file and two lists of strings.

The first list is the keyword list

k = ["hi", "bob"]

The second list is the words I want to replace the keywords with

r = ["ok", "bye"]

I want to read the text file and replace every occurrence of a keyword from k with the corresponding word from r. For example, "hi, how are you bob" should become "ok, how are you bye".

Upvotes: 4

Views: 114

Answers (2)

tobias_k

Reputation: 82899

I'll assume you've got the "reading string from file" part covered, so about that "replacing multiple strings" part: First, as suggested by Martijn, you can create a dictionary, mapping keys to replacements, using dict and zip.

>>> k = ["hi", "bob"]
>>> r = ["ok", "bye"]
>>> d = dict(zip(k, r))

Now, one way to replace all those keys at once is to use a regular expression that is a disjunction of all the keys, i.e. "hi|bob" in your example, and to call re.sub with a replacement function that looks up each matched key in that dictionary.

>>> import re
>>> re.sub('|'.join(k), lambda m: d[m.group()], "hi, how are you bob")
'ok, how are you bye'
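Note that the plain "hi|bob" pattern also matches inside longer words (the "hi" in "this", for instance) and treats any regex metacharacters in the keys as syntax. Here is a minimal sketch of a stricter variant, assuming the keys are ordinary words made of letters or digits; the name replace_keywords is only illustrative:

```python
import re

k = ["hi", "bob"]
r = ["ok", "bye"]
d = dict(zip(k, r))

# Escape each key so regex metacharacters are matched literally, and
# wrap the alternation in \b so that only whole words are replaced.
# Assumption: every key starts and ends with a word character.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, k)) + r")\b")

def replace_keywords(text):
    """Replace every whole-word occurrence of a key with its replacement."""
    return pattern.sub(lambda m: d[m.group()], text)

print(replace_keywords("hi, how are you bob"))
print(replace_keywords("this is bob"))  # the "hi" inside "this" is left alone
```

Whether whole-word matching is what you want depends on your text; drop the \b anchors if substrings should be replaced too.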

Alternatively, you can just use a loop to replace each key-replacement pair one after the other:

s = "hi, how are you bob"
for (x, y) in zip(k, r):
    s = s.replace(x, y)
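One caveat with the loop: because the replacements are applied one after the other, the output of an earlier replacement can be replaced again by a later one. A small sketch with hypothetical lists k2 and r2 (not the ones from the question), where "ok" is both a replacement and a key:

```python
import re

# Hypothetical lists, chosen so that a replacement ("ok") is also a key.
k2 = ["hi", "ok"]
r2 = ["ok", "bye"]

# Sequential replacement: "hi" becomes "ok", and the second iteration
# then turns that freshly inserted "ok" into "bye".
s = "hi bob"
for x, y in zip(k2, r2):
    s = s.replace(x, y)
looped = s

# Single-pass regex replacement: each match in the original text is
# replaced exactly once, so "hi" stays "ok".
d2 = dict(zip(k2, r2))
single_pass = re.sub("|".join(map(re.escape, k2)),
                     lambda m: d2[m.group()], "hi bob")

print(looped)       # bye bob
print(single_pass)  # ok bob
```

With the lists from the question the two approaches give the same result, so this only matters if keys and replacements can overlap.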

Upvotes: 0

Régis B.

Reputation: 10598

Let's say you have already parsed your sentence:

sentence = ['hi', 'how', 'are', 'you', 'bob']

What you want to do is to check whether each word in this sentence is present in k. If yes, replace it by the corresponding element in r; else, use the actual word. In other words:

if word in k:
    word_index = k.index(word)    
    new_word = r[word_index]

This can be written in a more concise way:

new_word = r[k.index(word)] if word in k else word

Using list comprehensions, here's how you go about processing the whole sentence:

new_sentence = [r[k.index(word)] if word in k else word for word in sentence]

new_sentence is now equal to ['ok', 'how', 'are', 'you', 'bye'] (which is what you want).

Note that in the code above we perform two equivalent search operations: word in k and k.index(word). This is inefficient. These two operations can be reduced to one by catching exceptions from the index method:

def get_new_word(word, k, r):
    try:
        # list.index raises ValueError if word is not in k
        word_index = k.index(word)
        return r[word_index]
    except ValueError:
        return word

new_sentence = [get_new_word(word, k, r) for word in sentence]

Now, you should also note that searching for word in k is a search with O(n) complexity (where n is the number of keywords). Thus the complexity of this algorithm is O(n·m) (where m is the sentence length). You can reduce this complexity to O(m) by using a more appropriate data structure, as suggested by the other comments. This is left as an exercise :-p
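For reference, one possible approach to that exercise is a dictionary built with dict and zip, which gives constant-time lookups on average. This is only a sketch; the name replacements is illustrative:

```python
k = ["hi", "bob"]
r = ["ok", "bye"]
sentence = ['hi', 'how', 'are', 'you', 'bob']

# Map each keyword to its replacement; a dict lookup is O(1) on average,
# so processing the whole sentence is O(m) instead of O(n·m).
replacements = dict(zip(k, r))

# dict.get returns the word itself when it is not a keyword.
new_sentence = [replacements.get(word, word) for word in sentence]
print(new_sentence)  # ['ok', 'how', 'are', 'you', 'bye']
```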

Upvotes: 1
