Reputation: 23
I'm really new to programming and Python, and I've been trying to use spaCy with Python 3.x. However, when I apply .pos_ to a text to find the part of speech, I don't get any result. I've made sure that spaCy is properly installed, and I've browsed other Stack Overflow posts as well as this GitHub post, but they didn't help.
Here is the code that I used:
from spacy.lang.en import English

parser = English()
tokens = parser('She ran')
dir(tokens[0])
print(dir(tokens[0]))

def show_POS(text):
    tokens = parser(text)
    for token in tokens:
        print(token.text, token.pos_)

show_POS("She hit the wall.")

def show_dep(text):
    tokens = parser(text)
    for token in tokens:
        print(" {} : {} : {} :{}".format(token.orth_, token.pos_, token.dep_, token.head))

print("token : POS : dep. : head")
print("-------------------------")
show_dep("She hit the wall.")

ex1 = parser("he drinks a water")
for word in ex1:
    print(word.text, word.pos_)
And here is the output:
/Users/dalals4/PycharmProjects/NLP-LEARNING/venv/bin/python
/Users/dalals4/PycharmProjects/NLP_learning_practice_chp5.py
['_', '__bytes__', '__class__', '__delattr__', '__dir__', '__doc__',
'__eq__', '__format__', '__ge__', '__getattribute__', '__gt__',
'__hash__', '__init__', '__init_subclass__', '__le__', '__len__',
'__lt__', '__ne__', '__new__', '__pyx_vtable__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__setstate__',
'__sizeof__', '__str__', '__subclasshook__', '__unicode__',
'ancestors', 'check_flag', 'children', 'cluster', 'conjuncts', 'dep',
'dep_', 'doc', 'ent_id', 'ent_id_', 'ent_iob', 'ent_iob_', 'ent_type',
'ent_type_', 'get_extension', 'has_extension', 'has_vector', 'head',
'i', 'idx', 'is_alpha', 'is_ancestor', 'is_ascii', 'is_bracket',
'is_currency', 'is_digit', 'is_left_punct', 'is_lower', 'is_oov',
'is_punct', 'is_quote', 'is_right_punct', 'is_sent_start', 'is_space',
'is_stop', 'is_title', 'is_upper', 'lang', 'lang_', 'left_edge',
'lefts', 'lemma', 'lemma_', 'lex_id', 'like_email', 'like_num',
'like_url', 'lower', 'lower_', 'n_lefts', 'n_rights', 'nbor', 'norm',
'norm_', 'orth', 'orth_', 'pos', 'pos_', 'prefix', 'prefix_', 'prob',
'rank', 'right_edge', 'rights', 'sent_start', 'sentiment',
'set_extension', 'shape', 'shape_', 'similarity', 'string', 'subtree',
'suffix', 'suffix_', 'tag', 'tag_', 'text', 'text_with_ws', 'vector',
'vector_norm', 'vocab', 'whitespace_']
She
hit
the
wall
.
token : POS : dep. : head
-------------------------
She : : : She
hit : : : hit
the : : : the
wall : : : wall
. : : : .
he
drinks
a
water
Process finished with exit code 0
Any help would be much appreciated! Thank you so much in advance :)
Upvotes: 2
Views: 1569
Reputation: 7105
The problem here is that you're only importing the English language class, which includes language-specific data like tokenization rules. You're not actually loading a model, which is what enables spaCy to predict part-of-speech tags and other linguistic annotations.
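To see the difference for yourself (a minimal sketch, assuming spaCy is installed), you can inspect the pipeline components: a blank English object only tokenizes, so there is nothing in it that could assign .pos_ or .dep_.

```python
import spacy
from spacy.lang.en import English

blank = English()
# A blank pipeline has no components that predict annotations,
# so tokens come back with empty .pos_ and .dep_ values.
print(blank.pipe_names)  # []
```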
If you haven't done so already, you first need to install a model package, e.g. the small English model:
python -m spacy download en_core_web_sm
You can then tell spaCy to load it by calling spacy.load:
import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp(u"she ran")
for token in doc:
    print(token.text, token.pos_)
This will give you an instance of the English class with the model weights loaded in, so spaCy can predict part-of-speech tags, dependency labels and named entities.
If you're new to spaCy, I'd recommend checking out the spaCy 101 guide in the docs. It explains the most important concepts, and includes many examples that you can run.
Upvotes: 4