Reputation: 27
I have this list of words and their corresponding POS and other values:
sentence= [[{'entity': 'adj', 'score': 0.9004535, 'index': 1, 'word': 'we', 'start': 0, 'end': 7}], [{'entity': 'verb', 'score': 0.8782018, 'index': 1, 'word': 'have', 'start': 0, 'end': 6}], [{'entity': 'verb', 'score': 0.9984743, 'index': 1, 'word': 'become', 'start': 0, 'end': 3}], [{'entity': 'noun', 'score': 0.9953852, 'index': 1, 'word': 'see', 'start': 0, 'end': 6}]]
I'm trying to extract all words that are not tagged "verb" or "prep". In other words, I want to exclude verbs and prepositions. I used this code:
sentence = [ sub['word'] for sub in sentence if sub['entity']!='verb' ]
But I get this error:
TypeError: list indices must be integers or slices, not str
Thank you
Upvotes: 0
Views: 6342
Reputation: 26870
Your input is a list of lists, so in your comprehension each sub is a list, and sub['word'] raises the TypeError. Each sub-list contains a single element, which is a dictionary. The fact that the individual dictionaries are wrapped in lists implies that a sub-list might hold more than one dictionary (otherwise why use a list?). Your code should account for that.
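To make the structure concrete, here is a minimal sketch that reproduces the error on a shortened copy of the question's data. The trimmed dictionaries and the names first and error_message are only for illustration and are not part of the original code.

```python
# Shortened copy of the question's data: a list of one-element lists of dicts.
sentence = [
    [{'entity': 'adj', 'word': 'we'}],
    [{'entity': 'verb', 'word': 'have'}],
]

first = sentence[0]
print(type(first).__name__)     # the outer elements are lists
print(type(first[0]).__name__)  # the dictionaries sit one level deeper

try:
    first['word']  # indexing a list with a string key
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Indexing with first[0]['word'] instead reaches the dictionary and returns the word.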
The safest way to deal with this is to write a generator that iterates over both list levels and yields relevant results.
For example:
sentence= [[{'entity': 'adj', 'score': 0.9004535, 'index': 1, 'word': 'we', 'start': 0, 'end': 7}], [{'entity': 'verb', 'score': 0.8782018, 'index': 1, 'word': 'have', 'start': 0, 'end': 6}], [{'entity': 'verb', 'score': 0.9984743, 'index': 1, 'word': 'become', 'start': 0, 'end': 3}], [{'entity': 'noun', 'score': 0.9953852, 'index': 1, 'word': 'see', 'start': 0, 'end': 6}]]
# ignore any entities given in the second argument (list)
def extract(_list, ignore):
    for element in _list:
        for _dict in element:
            if _dict.get('entity') not in ignore:
                yield _dict.get('word')

for word in extract(sentence, ['verb', 'prep']):
    print(word)
Output:
we
see
Upvotes: 1
Reputation: 1545
As stated in the comments, you are iterating over lists, so index into each sub-list before looking up a key:
non_verb_words = [word[0]['word'] for word in sentence if word[0]['entity']!='verb']
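The one-liner above assumes exactly one dictionary per sub-list and filters only 'verb'. A possible variant, sketched here under the assumption that sub-lists may hold several dictionaries and that both 'verb' and 'prep' should be dropped, flattens the nesting first. The names EXCLUDED and words_excluding are hypothetical.

```python
from itertools import chain

sentence = [
    [{'entity': 'adj', 'score': 0.9004535, 'index': 1, 'word': 'we', 'start': 0, 'end': 7}],
    [{'entity': 'verb', 'score': 0.8782018, 'index': 1, 'word': 'have', 'start': 0, 'end': 6}],
    [{'entity': 'verb', 'score': 0.9984743, 'index': 1, 'word': 'become', 'start': 0, 'end': 3}],
    [{'entity': 'noun', 'score': 0.9953852, 'index': 1, 'word': 'see', 'start': 0, 'end': 6}],
]

# Hypothetical name: the entity tags to leave out.
EXCLUDED = {'verb', 'prep'}

def words_excluding(nested, excluded):
    """Return the words of all dicts in a list of lists whose entity is not excluded."""
    # chain.from_iterable flattens one level, so every inner dict is visited.
    return [d['word'] for d in chain.from_iterable(nested) if d['entity'] not in excluded]

kept = words_excluding(sentence, EXCLUDED)
print(kept)  # ['we', 'see']
```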
Upvotes: 2
Reputation: 2819
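sentence is a list of lists, so each sub in your comprehension is itself a list. Iterating over sentence[0] alone would only look at the first sub-list, so iterate over the inner lists as well:
sentence = [ d['word'] for sub in sentence for d in sub if d['entity']!='verb' ]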
Upvotes: 1