Reputation: 13
I am trying to extract verbs and verb phrases from a column containing sentences. For this purpose I have created a function called tagging. Here's my code:
pattern = [{'POS': 'VERB'}]

def tagging(txt):
    verb_phrases = textacy.extract.matches(txt, patterns=pattern)
    return verb_phrases
I then apply this function to a column of my dataset:
dataset['Verbs'] = dataset['Sentences'].apply(lambda x: tagging(x))
dataset['Verbs']
But instead of returning verbs, the output looks like this:
0 <generator object matches at 0x7f8eb98a5258>
1 <generator object matches at 0x7f8eb97df6d0>
2 <generator object matches at 0x7f8eb97df728>
3 <generator object matches at 0x7f8eb97df570>
4 <generator object matches at 0x7f8eb97df678>
Upvotes: 1
Views: 2020
Reputation: 1570
Going through the docs of textacy.extract.matches, this function does not return but rather yields a Span, and that's why you get generators.
Yields:
Next matching ``Span`` in ``doc``, in order of appearance
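A common way of unrolling a generator is to build a list from it ([item for item in generator] / list(generator)) or to iterate through it (for item in generator:). Note that writing [generator] on its own only wraps the generator object in a one-element list and does not unroll it.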
For your case:
for verb_span in textacy.extract.matches(txt, patterns=pattern):
    print(verb_span)

verb_list = [verb_span for verb_span in textacy.extract.matches(txt, patterns=pattern)]
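To see the behaviour without textacy installed, here is a minimal sketch using a made-up generator function. The name fake_matches and the hard-coded verb set are assumptions for illustration only; they are not part of textacy. It shows that calling a function that yields gives you a generator object, that list() materializes it, and that a generator can only be consumed once.

```python
import inspect
from typing import Iterator, List, Set

# Hypothetical stand-in for a function that yields matches lazily
# (not textacy's API; the verb set is hard-coded for illustration).
def fake_matches(words: List[str], verbs: Set[str]) -> Iterator[str]:
    for word in words:
        if word in verbs:
            yield word

words = "she runs and jumps daily".split()
verbs = {"runs", "jumps"}

gen = fake_matches(words, verbs)
is_gen = inspect.isgenerator(gen)  # calling the function only creates a generator

first_pass = list(gen)   # consumes the generator and collects its items
second_pass = list(gen)  # the generator is now exhausted, so nothing is left

print(is_gen, first_pass, second_pass)
```

Because a generator is exhausted after one pass, storing generators in a DataFrame column is rarely what you want; converting to a list inside the applied function keeps the results reusable.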
Upvotes: 1
Reputation: 3294
Try this:
pattern = [{'POS': 'VERB'}]

def tagging(txt):
    verb_phrases = textacy.extract.matches(txt, patterns=pattern)
    return list(verb_phrases)
Upvotes: 0