Wickkiey

Reputation: 4632

spaCy: get pos & tag for a specific word

I came across a situation where I have to get the pos_ and tag_ from spaCy Doc objects.

For example,

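import spacy

nlp = spacy.load("en_core_web_sm")  # assumption: any English pipeline with a tagger and parser

text = "Australian striker John hits century"
doc = nlp(text)
for nc in doc.noun_chunks:
    print(nc)  # Australian striker John
doc[1].tag_  # tag of the token at index 1, 'striker'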

If I want the pos_ and tag_ for the word 'striker', do I need to pass the sentence to nlp() again?

Also, doc[1].tag_ works, but I need something like doc['striker'].tag_.

Is that possible?

Upvotes: 3

Views: 4950

Answers (2)

Sofie VL

Reputation: 3096

You only have to process the text once:

text = "Australian striker John hits century"
doc = nlp(text)
for nc in doc.noun_chunks:
    print(nc)
    # text, fine-grained tag, and coarse POS for every token in the chunk
    print([(token.text, token.tag_, token.pos_) for token in nc])

If you only want specific words within the noun chunk, you can filter further by changing the second print statement, e.g. to keep only the tokens tagged NN:

print([(token.text, token.tag_, token.pos_) for token in nc if token.tag_ == 'NN'])

Note that this may print multiple hits, depending on your model & input sentence.
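The question asks for something like doc['striker'].tag_. As a minimal sketch of one way to get close to that, you could build a lookup from the tokens of the already-processed doc. Note that index_by_word is a hypothetical helper, not part of spaCy; it only relies on the text, tag_ and pos_ attributes of each token, and it lowercases the keys (an assumption about what "the same word" means here).

```python
from collections import defaultdict


def index_by_word(doc):
    """Map each lowercased token text to the (tag_, pos_) pairs of its occurrences.

    `doc` can be any iterable of objects exposing .text, .tag_ and .pos_,
    such as a processed spaCy Doc. This helper is illustrative only.
    """
    index = defaultdict(list)
    for token in doc:
        # A word may occur more than once, so keep one entry per occurrence.
        index[token.text.lower()].append((token.tag_, token.pos_))
    return dict(index)
```

With the doc from above, index_by_word(doc)["striker"] would return a list with one (tag_, pos_) pair per occurrence of the word; the actual tags depend on your model.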

Upvotes: 2

Palash Jhamb

Reputation: 625

You can do the following:

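import re

text = "Australian striker John hits century"
x1 = "striker"
# escape the word so it is matched literally, ignoring case
x2 = re.compile(re.escape(x1), re.IGNORECASE)
# character offsets at which the word starts in the text
loc_indexes = [m.start(0) for m in x2.finditer(text)]
# token.idx is the character offset of the token within the text
tag = [i.tag_ for i in nlp(text) if i.idx in loc_indexes]
print(x1, tag[0])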

This gives the output: striker NN

If required, you can easily make this dynamic by treating x1 as a variable.
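A rough sketch of what the dynamic version could look like, wrapped in a function. The name tags_for_word and its parameters are made up for illustration. It assumes doc is the result of processing text (so each token's idx is its character offset in text), and it additionally compares the end offset of each match so that a search for "strike" does not pick up the token "striker".

```python
import re


def tags_for_word(doc, text, word):
    """Return tag_ for every token in `doc` whose text matches `word`, ignoring case.

    `doc` is assumed to be the processed form of `text`, with each token
    exposing .text, .idx (character offset) and .tag_.
    """
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    # (start, end) character spans of every match of the word in the text
    spans = {m.span() for m in pattern.finditer(text)}
    # keep only tokens whose own character span is exactly one of the matches
    return [t.tag_ for t in doc if (t.idx, t.idx + len(t.text)) in spans]
```

Called as tags_for_word(doc, text, "striker"), this returns a list, which is empty when the word does not occur as a whole token, so there is no IndexError to guard against as with tag[0].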

Upvotes: 0
