Steven Matthews

Reputation: 11295

Two lists, one of words, one of phrases

I have two lists. The first is a list of words, like so:

["happy", "sad", "angry", "jumpy"]

etc

And then a list of phrases, like so:

["I'm so happy with myself lately!", "Johnny, im so sad, so very sad, call me", "i feel like crap. SO ANGRY!!!!"]

I want to use the first list to find the occurrences of those words in the list of phrases. I don't care whether I get the actual words, separated by spaces, or just the number of times they occur.

From what I've read, it seems the re module and filters are the way to go. Is that right?

Also, if my explanation of what I need is unclear, please let me know.

Upvotes: 2

Views: 213

Answers (3)

Joel Cornett

Reputation: 24788

>>> import re
>>> phrases = ["I'm so happy with myself lately!", "Johnny, im so sad, so very sad, call me", "i feel like crap. SO ANGRY!!!!"]
>>> words = ["happy", "sad", "angry", "jumpy"]
>>> words_in_phrases = [re.findall(r"\b[\w']+\b", phrase.lower()) for phrase in phrases]
>>> words_in_phrases
[["i'm", 'so', 'happy', 'with', 'myself', 'lately'], ['johnny', 'im', 'so', 'sad', 'so', 'very', 'sad', 'call', 'me'], ['i', 'feel', 'like', 'crap', 'so', 'angry']]
>>> word_counts = [{word: phrase.count(word) for word in words} for phrase in words_in_phrases]
>>> word_counts
[{'jumpy': 0, 'angry': 0, 'sad': 0, 'happy': 1}, {'jumpy': 0, 'angry': 0, 'sad': 2, 'happy': 0}, {'jumpy': 0, 'angry': 1, 'sad': 0, 'happy': 0}]

The line word_counts = [{word: phrase.count(word) for word in words} for... uses a dict comprehension, so it needs Python 2.7+. If, for some reason, you're using a version older than Python 2.7, replace that line with the following:

>>> word_counts = [dict((word, phrase.count(word)) for word in words) for phrase in words_in_phrases]
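As a variation on the same idea, the tokenized words can be fed into collections.Counter, which also makes it easy to total the counts across all phrases. This is only a sketch: the helper name count_words and the totals variable are made up for illustration, and the simplified pattern [\w']+ is an assumption about what should count as a word.

```python
import re
from collections import Counter

phrases = [
    "I'm so happy with myself lately!",
    "Johnny, im so sad, so very sad, call me",
    "i feel like crap. SO ANGRY!!!!",
]
words = ["happy", "sad", "angry", "jumpy"]


def count_words(phrase, words):
    """Count case-insensitive, whole-token occurrences of each word (hypothetical helper)."""
    # Lowercase first so "ANGRY" and "angry" land in the same bucket.
    tokens = Counter(re.findall(r"[\w']+", phrase.lower()))
    # A Counter returns 0 for missing keys, so absent words need no special case.
    return {word: tokens[word] for word in words}


# One dict of counts per phrase, in the same order as phrases.
word_counts = [count_words(phrase, words) for phrase in phrases]

# Sum the per-phrase counts into one overall tally.
totals = Counter()
for counts in word_counts:
    totals.update(counts)

print(word_counts)
print(dict(totals))
```

Tokenizing each phrase once and then looking words up in the Counter avoids rescanning the token list for every word, which may matter if the word list is long.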

Upvotes: 1

Katriel

Reputation: 123662

>>> phrases = ["I'm so happy with myself lately!", "Johnny, im so sad, so very sad, call me", "i feel like crap. SO ANGRY!!!!"]
>>> words = ["happy", "sad", "angry", "jumpy"]
>>> 
>>> for phrase in phrases:
...     print(phrase)
...     print({word: phrase.count(word) for word in words})
... 
I'm so happy with myself lately!
{'jumpy': 0, 'angry': 0, 'sad': 0, 'happy': 1}
Johnny, im so sad, so very sad, call me
{'jumpy': 0, 'angry': 0, 'sad': 2, 'happy': 0}
i feel like crap. SO ANGRY!!!!
{'jumpy': 0, 'angry': 0, 'sad': 0, 'happy': 0}
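
Note that str.count is case-sensitive and counts substrings, which is why 'angry' comes out as 0 for the last phrase, and why a word like "sad" would also be counted inside "sadly". If that matters, one possible workaround is a word-boundary regex with re.IGNORECASE. The sketch below assumes whole-word, case-insensitive matching is what's wanted; count_whole_word is a hypothetical helper name, and the second test string is made up to show the substring effect.

```python
import re


def count_whole_word(word, phrase):
    """Count case-insensitive occurrences of word as a whole word (hypothetical helper)."""
    # re.escape guards against words containing regex metacharacters.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, phrase, flags=re.IGNORECASE))


last_phrase = "i feel like crap. SO ANGRY!!!!"
made_up = "sadly, the saddest"  # illustrative string, not from the question

# Case-sensitive substring count versus whole-word, case-insensitive count.
print(last_phrase.count("angry"), count_whole_word("angry", last_phrase))

# Substring count also hits "sadly" and "saddest"; the whole-word count does not.
print(made_up.count("sad"), count_whole_word("sad", made_up))
```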

Upvotes: 4

poke

Reputation: 387835

Very simple, straightforward solution (note that the in test is case-sensitive, so "ANGRY" in the last phrase does not match "angry"):

>>> phrases = ["I'm so happy with myself lately!", "Johnny, im so sad, so very sad, call me", "i feel like crap. SO ANGRY!!!!"]
>>> words = ["happy", "sad", "angry", "jumpy"]
>>> for phrase in phrases:
        for word in words:
            if word in phrase:
                print('"{0}" is in the phrase "{1}".'.format(word, phrase))

"happy" is in the phrase "I'm so happy with myself lately!".
"sad" is in the phrase "Johnny, im so sad, so very sad, call me".

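If you'd rather pull out the matched words themselves and ignore case, one option is to compile all the words into a single alternation pattern and use findall. This is a rough sketch, not the only way to do it: find_words is a hypothetical helper name, and lowercasing the matches is an assumption about the output you want.

```python
import re

phrases = [
    "I'm so happy with myself lately!",
    "Johnny, im so sad, so very sad, call me",
    "i feel like crap. SO ANGRY!!!!",
]
words = ["happy", "sad", "angry", "jumpy"]

# Build one pattern of the form \b(?:happy|sad|angry|jumpy)\b.
# The group is non-capturing, so findall returns the full matched text.
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in words) + r")\b",
    re.IGNORECASE,
)


def find_words(phrase):
    """Return every listed word found in phrase, lowercased (hypothetical helper)."""
    return [match.lower() for match in pattern.findall(phrase)]


for phrase in phrases:
    found = find_words(phrase)
    # Print both the matched words and how many there were.
    print('"{0}": {1} ({2} found)'.format(phrase, " ".join(found), len(found)))
```

Since the matches come back as a list, len(found) gives the count and " ".join(found) gives the words separated by spaces, which covers both forms the question asks about.
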
Upvotes: 2
