Reputation: 65
I want to compare two lists index by index and extract the words they have in common.
colours = ["yellow", "light pink", "red", "dark blue", "red"]
items = ["the sun is yellow but the sunset is red ",
"the pink cup is pretty under the light",
"it seems like the flower is red",
"skies are blue",
"i like red"]
Expected result:
["yellow", "pink light", "red", "blue", "red"]
If an entry in colours contains two words, it is split into individual words, and each word is looked up separately in the sentence at the same index. As you can see, the order of the words in colours ("light pink" versus "pink light") does not matter. Note that for the first entry in items, although "red" appears in the colours list, I do not want to extract it, because that "red" is at a different index from the sentence's index.
For the fourth pair, "dark blue" and "skies are blue", the result should contain only "blue", because "dark" does not appear in the sentence.
I tried the code below, but the lists are not compared once per matching index. Instead, every colour is checked against every sentence, hence the repeated "red".
colours=["yellow","light pink","red"," dark blue","red"]
items=["the sun is yellow but the sunset is red ","the pink cup is pretty under the light", "it seems like the flower is red", "skies are blue","i like red"]
for i in colours:
    y = i.split()  # split a multi-word colour into single words
    for j in y:
        # iterate word by word over colours that have more than one word
        for z in items:
            s = z.split()  # split the sentence into tokens/words
            for l in s:
                # compare each word in items with each word in colours
                if j == l:
                    print(j)
Result:
yellow
light
pink
red
red
red
blue
red
red
red
Correct Result:
yellow
pink light
red
blue
red
Upvotes: 1
Views: 62
Reputation: 109526
Using sets to test for membership can be faster, especially for longer sentences, with a caveat:
>>> [' '.join(set(colour.split()) & set(item.split()))
...  for colour, item in zip(colours, items)]
['yellow', 'pink light', 'red', 'blue', 'red']
The caveat is that sets are unordered, so 'pink light' could come out as 'light pink'.
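If the order matters, one possible workaround is to keep a set only for the lookup and walk the sentence in its own order. This is a minimal sketch, not part of the answer above; the helper name common_words is made up for illustration, and it assumes repeated words should be reported once.

```python
colours = ["yellow", "light pink", "red", "dark blue", "red"]
items = ["the sun is yellow but the sunset is red ",
         "the pink cup is pretty under the light",
         "it seems like the flower is red",
         "skies are blue",
         "i like red"]

def common_words(colour, item):
    """Return the words of `colour` found in `item`, in sentence order.

    Hypothetical helper: the set is used only for fast membership tests,
    while the output order comes from iterating over the sentence.
    """
    wanted = set(colour.split())
    seen = set()
    result = []
    for word in item.split():
        # Keep a word only if it is one of the colour words and has not
        # been reported yet (assumption: duplicates are unwanted).
        if word in wanted and word not in seen:
            seen.add(word)
            result.append(word)
    return ' '.join(result)

matches = [common_words(c, i) for c, i in zip(colours, items)]
print(matches)
```

With the lists from the question, the second entry comes out in sentence order, i.e. 'pink light'.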
Upvotes: 1
Reputation: 106465
You can use the following list comprehension:
print([' '.join(w for w in i.split() if w in c.split()) for c, i in zip(colours, items)])
This outputs:
['yellow', 'pink light', 'red', 'blue', 'red']
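To see why pairing with zip removes the repeated "red" from the question's output, it may help to count the comparisons. The sketch below is only an illustration; the variable names are invented for this example.

```python
colours = ["yellow", "light pink", "red", "dark blue", "red"]
items = ["the sun is yellow but the sunset is red ",
         "the pink cup is pretty under the light",
         "it seems like the flower is red",
         "skies are blue",
         "i like red"]

# Nested loops visit every (colour, item) combination.
all_pairs = [(c, i) for c in colours for i in items]

# zip visits only the pairs that share the same position.
same_index = list(zip(colours, items))

# Count how often the colour "red" finds the word "red" in a sentence.
nested_red = sum(1 for c, i in all_pairs
                 if c == "red" and "red" in i.split())
zipped_red = sum(1 for c, i in same_index
                 if c == "red" and "red" in i.split())

print(len(all_pairs), len(same_index))
print(nested_red, zipped_red)
```

The nested version makes 25 comparisons and finds "red" six times, which matches the six "red" lines in the question; the zipped version makes 5 comparisons and finds it twice.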
Upvotes: 3
Reputation: 26039
With zip, you could do this much more easily:
colours=["yellow","light pink","red"," dark blue","red"]
items=["the sun is yellow but the sunset is red ","the pink cup is pretty under the light", "it seems like the flower is red", "skies are blue","i like red"]
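lst = []
for x, y in zip(colours, items):
    word = ''
    for c in y.split():
        # compare against the colour's words, not the raw string,
        # so that substrings such as "ink" in "pink" do not match
        if c in x.split():
            word = word + ' ' + c
    lst.append(word.strip())
print(lst)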
# ['yellow', 'pink light', 'red', 'blue', 'red']
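One detail to keep in mind when adapting this kind of loop: in Python, `in` applied to a string is a substring test, while `in` applied to a list checks whole elements. A small sketch of the difference, using a made-up probe word "ink":

```python
colour = "light pink"

# Substring test: "ink" occurs inside "pink", so this is True.
substring_hit = "ink" in colour

# Element test: "ink" is not one of the words ["light", "pink"].
word_hit = "ink" in colour.split()

# A real colour word passes the element test.
pink_hit = "pink" in colour.split()

print(substring_hit, word_hit, pink_hit)
```

For the data in the question both tests happen to give the same result, but with other sentences the substring test could pick up short words such as "ink" or "a".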
Upvotes: 4