Reputation: 13

Replace all word occurrences in a list with the index of items in another list

I have a list -

A=["hi how are you","have good day","where are you going ","do you like the place"]

and another list -

B=["how","good","where","going","like","place"]

List B contains some of the words that appear in list A. I want to replace every word in list A with its position in list B (counting from 1). If a word doesn't appear in list B, it should be replaced with 0.

So list A after the replacement should be

["0 1 0 0","0 2 0","3 0 0 4","0 0 5 0 6"]

I tried using a for loop, but it's not efficient since my list has more than 10000 items. I also tried using the map function, but I wasn't successful.

Here is my attempt:

for item in list_A:
    words=sorted(item.split(), key=len,reverse=True)
    for w in words:
        if w.strip() in list_B:
            item=item.replace(w,str(list_B.index(w.strip())))
        else:
            item=item.replace(w,'0')

Upvotes: 0

Answers (5)

Transhuman

Reputation: 3547

This is in Python 3.x

A=["hi how are you","have good day","where are you going ","do you like the place"]
B=["how","good","where","going","like","place"]
list(map(' '.join, map(lambda x:[str(B.index(i)+1) if i in B else '0' for i in x], [i.split() for i in A])))

Output:

['0 1 0 0', '0 2 0', '3 0 0 4', '0 0 5 0 6']

Upvotes: 0

stamaimer

Reputation: 6475

You should define a function that returns the index of a word in the second list:

def get_index_of_word(word):
    try:
        return str(B.index(word) + 1)
    except ValueError:
        return '0'

Then you can use a list comprehension with a nested generator expression to build the result:

[' '.join(get_index_of_word(word) for word in sentence.split()) for sentence in A]

UPDATE

from collections import defaultdict

index = defaultdict(lambda: 0, ((word, i) for i, word in enumerate(B, 1)))

[' '.join(str(index[word]) for word in sentence.split()) for sentence in A]

Upvotes: 1

Garrigan Stafford

Reputation: 1403

What you could do is create a dictionary that maps each word in list B to its index. Then you only have to iterate through the first list once.

Something like

B = ["how", "yes"]
BDict = {}
index = 0
for x in B:
    BDict[x] = index
    index += 1

# A is the list of sentences from the question
for sentence in A:
    for word in sentence.split():
        if word in BDict:
            pass  # BDict[word] has the index of the current word in B
        else:
            pass  # Word does not exist in B

This should significantly decrease the runtime, since a dictionary has O(1) average lookup time. However, depending on the size of B, the dictionary could become quite large.

EDIT: Your code works; it is slow because the in operator and the index method both have to perform a linear search when you are using a list. So if B gets large, this can be a big slowdown. A dictionary, however, takes constant time on average both to check whether a key exists and to retrieve its value. By using the dictionary, you replace two O(n) operations per word with O(1) operations.
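To make the idea above concrete, here is a minimal sketch of the dictionary approach applied to the lists from the question. The function name replace_with_positions and its parameter names are made up for illustration, and it assumes the 1-based numbering shown in the question's expected output.

```python
def replace_with_positions(sentences, vocabulary):
    # Build the word -> position mapping once, in O(len(vocabulary)).
    # Positions are 1-based so that 0 can mean "not found".
    # If a word is repeated in vocabulary, the first position wins,
    # which matches what list.index would return.
    position = {}
    for i, word in enumerate(vocabulary, start=1):
        position.setdefault(word, i)

    # Each lookup is O(1) on average; unknown words become 0.
    return [
        " ".join(str(position.get(word, 0)) for word in sentence.split())
        for sentence in sentences
    ]


A = ["hi how are you", "have good day", "where are you going ", "do you like the place"]
B = ["how", "good", "where", "going", "like", "place"]

result = replace_with_positions(A, B)
print(result)  # ['0 1 0 0', '0 2 0', '3 0 0 4', '0 0 5 0 6']
```

The total work is roughly proportional to the number of words in A plus the length of B, instead of the number of words in A multiplied by the length of B.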

Upvotes: 1

cgte

Reputation: 450

Hi, your solution is making (too) many lookups.

Here is mine:

A=["hi how are you",
   "have good day",
   "where are you going ",
   "do you like the place"]

B=["how","good","where","going","like","place"]

# I assume B contains only unique elements.

gg = { word: idx for (idx, word) in enumerate(B, start=1)}
print(gg)

lookup = lambda word: str(gg.get(word, 0)) # Uses the index built above for efficient search and returns the proper object type (str).

def translate(str_):
    return ' '.join(lookup(word) for word in str_.split())        

print(translate("hi how are you")) # check for one sentence.


translated =  [translate(sentence) for sentence in A] # yey victory.

print(translated)

# Advanced usage

class missingdict(dict):
    def __missing__(self, key):
        return 0

miss = missingdict(gg)

def tr2(str_):
    return ' '.join(str(miss[word]) for word in str_.split())


print([tr2(sentence) for sentence in A])

You could also use the yield keyword (generators) once you are more comfortable with Python.
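As a rough sketch of what that could look like, the generator below produces one translated sentence at a time instead of building the whole list up front. The name translate_lazily is hypothetical, and like the code above it assumes B contains only unique words.

```python
def translate_lazily(sentences, vocabulary):
    # Word -> 1-based position; assumes vocabulary has no duplicates.
    position = {word: i for i, word in enumerate(vocabulary, start=1)}
    for sentence in sentences:
        # yield hands back one translated sentence per iteration,
        # so nothing is computed until the caller asks for it.
        yield " ".join(str(position.get(word, 0)) for word in sentence.split())


A = ["hi how are you", "have good day", "where are you going ", "do you like the place"]
B = ["how", "good", "where", "going", "like", "place"]

translated = list(translate_lazily(A, B))  # list() consumes the generator
print(translated)  # ['0 1 0 0', '0 2 0', '3 0 0 4', '0 0 5 0 6']
```

This mostly pays off when A is large and the results are processed one by one, for example when writing them to a file.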

Upvotes: 0

Ajax1234

Reputation: 71451

You can try this:

A=["hi how are you","have good day","where are you going ","do you like the place"]
A = [x.split() for x in A]
B=["how","good","where","going","like","place"]
# 1-based position of each word in B, or 0 if the word is not in B
new = [[B.index(d) + 1 if d in B else 0 for d in i] for i in A]

final = list(map(' '.join, map(lambda x: [str(i) for i in x], new)))

print(final)

Upvotes: 0
