user1767774

Reputation: 1825

POS tagging a single word in spaCy

The spaCy POS tagger is usually run on entire sentences. Is there a way to efficiently apply unigram POS tagging to a single word (or a list of single words)?

Something like this:

words = ["apple", "eat", "good"]
tags = get_tags(words)
print(tags)
> ["NNP", "VB", "JJ"]

Thanks.

Upvotes: 1

Views: 2398

Answers (2)

Amir Imani

Reputation: 3237

You can do something like this:

import spacy
nlp = spacy.load("en_core_web_sm")

word_list = ["apple", "eat", "good"]
for word in word_list:
    doc = nlp(word)
    print(doc[0].text, doc[0].pos_)

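Alternatively, you can build a single Doc from the word list and run the pipeline components on it yourself. Note that the words are then tagged as one sequence, so each tag can be influenced by its neighbours: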

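import spacy
from spacy.tokens import Doc

nlp = spacy.load("en_core_web_sm")
word_list = ["apple", "eat", "good"]

# build one pre-tokenized Doc, skipping the tokenizer
doc = Doc(nlp.vocab, words=word_list)

# run each pipeline component on the Doc in turn
for name, proc in nlp.pipeline:
    doc = proc(doc)

pos_tags = [x.pos_ for x in doc]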

Upvotes: 2

aab

Reputation: 11474

English unigrams are often hard to tag well, so think about why you want to do this and what you expect the output to be. (Why is the POS of apple in your example NNP? What's the POS of can?)
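To make the ambiguity concrete, here is a minimal sketch of a most-frequent-tag unigram tagger in plain Python. The tiny tagged list, its counts, and the names build_unigram_model and unigram_tag are all invented for illustration and have nothing to do with spaCy's tagger. The point is only that a unigram tagger has to commit to one tag per word, whatever the context would have been.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (word, tag) pairs with made-up frequencies.
TAGGED = [
    ("can", "MD"), ("can", "MD"), ("can", "NN"),
    ("apple", "NN"), ("apple", "NN"), ("apple", "NNP"),
    ("eat", "VB"), ("good", "JJ"),
]

def build_unigram_model(pairs):
    """Count how often each tag was seen for each lowercased word."""
    counts = defaultdict(Counter)
    for word, tag in pairs:
        counts[word.lower()][tag] += 1
    return counts

def unigram_tag(word, model, default="NN"):
    """Return the most frequent tag for the word, or a default if unseen."""
    tags = model.get(word.lower())
    if not tags:
        return default
    return tags.most_common(1)[0][0]

model = build_unigram_model(TAGGED)
tags = [unigram_tag(w, model) for w in ["apple", "can", "eat", "good"]]
print(tags)
```

With this toy data "can" always comes out as a modal and "apple" as a common noun, even though both have other readings, so the expected output has to be decided before choosing a method.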

spaCy isn't really intended for this kind of task, but if you want to use it anyway, one efficient way to do it is:

import spacy
nlp = spacy.load("en_core_web_sm")  # the 'en' shortcut was removed in spaCy v3

words = ["apple", "eat", "good"]

# disable everything except the tagger (and tok2vec, which a v3 tagger may depend on)
other_pipes = [pipe for pipe in nlp.pipe_names if pipe not in ("tagger", "tok2vec")]
nlp.disable_pipes(*other_pipes)

# use nlp.pipe() instead of nlp() to process multiple texts more efficiently
for doc in nlp.pipe(words):
    if len(doc) > 0:
        print(doc[0].text, doc[0].tag_)

See the documentation for nlp.pipe(): https://spacy.io/api/language#pipe
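If you want something shaped like the get_tags(words) call from the question, the loop above could be wrapped roughly as follows. This is only a sketch: get_tags is the hypothetical name from the question, not a spaCy function, and it assumes you pass in an already loaded pipeline object whose pipe() method yields Doc objects. Only the first token of each Doc is used, so an input that the tokenizer splits into several tokens (for example "don't") gets the tag of its first piece.

```python
def get_tags(words, nlp, batch_size=256):
    """Return one fine-grained tag per input string (None if the Doc is empty).

    `nlp` is assumed to be a loaded spaCy pipeline with a tagger;
    `batch_size` is passed through to nlp.pipe().
    """
    tags = []
    for doc in nlp.pipe(words, batch_size=batch_size):
        # Empty or whitespace-only strings can produce a Doc with no tokens.
        tags.append(doc[0].tag_ if len(doc) > 0 else None)
    return tags

# Usage (assumes spaCy and the en_core_web_sm model are installed):
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   print(get_tags(["apple", "eat", "good"], nlp))
```

Passing the pipeline in as an argument means the model is loaded once and reused across calls, which matters more for speed than anything inside the function.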

Upvotes: 5
