Sam Dalal

Reputation: 23

.pos_ in spaCy is not returning any results in Python

I'm really new to programming and Python, and I've been trying to use spaCy with Python 3.x. However, when I apply .pos_ to a text to find the part of speech, I don't get any result. I've made sure that spaCy is properly installed, and I've browsed other Stack Overflow posts as well as a related GitHub post, but they didn't help.

Here is the code that I used:

from spacy.lang.en import English
parser = English()

tokens = parser('She ran')
print(dir(tokens[0]))  # inspect the attributes available on a Token


def show_POS(text):
    tokens = parser(text)
    for token in tokens:
        print(token.text, token.pos_)


show_POS("She hit the wall.")


def show_dep(text):
    tokens = parser(text)
    for token in tokens:
        print(" {} : {} : {} :{}".format(token.orth_,token.pos_,token.dep_,token.head))


print("token : POS : dep. : head")
print("-------------------------")
show_dep("She hit the wall.")

ex1 = parser("he drinks a water")
for word in ex1:
    print(word.text, word.pos_)

And here is the output:

/Users/dalals4/PycharmProjects/NLP-LEARNING/venv/bin/python /Users/dalals4/PycharmProjects/NLP_learning_practice_chp5.py
['_', '__bytes__', '__class__', '__delattr__', '__dir__', '__doc__', 
'__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', 
'__hash__', '__init__', '__init_subclass__', '__le__', '__len__', 
'__lt__', '__ne__', '__new__', '__pyx_vtable__', '__reduce__', 
'__reduce_ex__', '__repr__', '__setattr__', '__setstate__', 
'__sizeof__', '__str__', '__subclasshook__', '__unicode__', 
'ancestors', 'check_flag', 'children', 'cluster', 'conjuncts', 'dep', 
'dep_', 'doc', 'ent_id', 'ent_id_', 'ent_iob', 'ent_iob_', 'ent_type', 
'ent_type_', 'get_extension', 'has_extension', 'has_vector', 'head', 
'i', 'idx', 'is_alpha', 'is_ancestor', 'is_ascii', 'is_bracket', 
'is_currency', 'is_digit', 'is_left_punct', 'is_lower', 'is_oov', 
'is_punct', 'is_quote', 'is_right_punct', 'is_sent_start', 'is_space', 
'is_stop', 'is_title', 'is_upper', 'lang', 'lang_', 'left_edge', 
'lefts', 'lemma', 'lemma_', 'lex_id', 'like_email', 'like_num', 
'like_url', 'lower', 'lower_', 'n_lefts', 'n_rights', 'nbor', 'norm', 
'norm_', 'orth', 'orth_', 'pos', 'pos_', 'prefix', 'prefix_', 'prob', 
'rank', 'right_edge', 'rights', 'sent_start', 'sentiment', 
'set_extension', 'shape', 'shape_', 'similarity', 'string', 'subtree', 
'suffix', 'suffix_', 'tag', 'tag_', 'text', 'text_with_ws', 'vector', 
'vector_norm', 'vocab', 'whitespace_']
She 
hit 
the 
wall 
. 
token : POS : dep. : head
-------------------------
 She :  :  : She
 hit :  :  : hit
 the :  :  : the
 wall :  :  : wall
 . :  :  : .
he 
drinks 
a 
water 

Process finished with exit code 0

Any help would be much appreciated! Thank you so much in advance :)

Upvotes: 2

Views: 1569

Answers (1)

Ines Montani

Reputation: 7105

The problem here is that you're only importing the English language class, which includes language-specific data like tokenization rules. You're not actually loading a model, which is what enables spaCy to predict part-of-speech tags and other linguistic annotations.
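To illustrate (a minimal sketch, assuming spaCy v2.x): a plain English() object is a blank pipeline with no trained components, which is why .pos_ comes back empty:

from spacy.lang.en import English

nlp = English()           # blank pipeline: tokenizer only, no trained components
print(nlp.pipe_names)     # [] -- no tagger, so token.pos_ stays empty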

If you haven't done so already, you first need to install a model package, e.g. the small English model:

python -m spacy download en_core_web_sm
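To confirm the download succeeded, you can run spaCy's validate command (available in v2.x), which lists the installed models and checks that they're compatible with your spaCy version:

python -m spacy validate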

You can then tell spaCy to load it by calling spacy.load:

import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp(u"she ran")
for token in doc:
    print(token.text, token.pos_)

This will give you an instance of the English class with the model weights loaded in, so spaCy can predict part-of-speech tags, dependency labels and named entities.
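Running the snippet above should print something like this (the exact labels can vary with the model version):

she PRON
ran VERB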

If you're new to spaCy, I'd recommend checking out the spaCy 101 guide in the docs. It explains the most important concepts, and includes many examples that you can run.

Upvotes: 4
