HJA24

Reputation: 365

Negation and dependency parsing with spaCy

Sentiment words behave very differently when they fall under the semantic scope of a negation. I want to use a slightly modified version of the approach by Das and Chen (2001). They detect words such as no, not, and never, and then append a "neg" suffix to every word that appears between a negation and a clause-level punctuation mark. I want to build something similar using spaCy's dependency parsing.
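As I understand it, that baseline can be sketched roughly as follows. The word lists, the `_neg` spelling of the suffix, the function name `mark_negation`, and the example sentence are my own assumptions for illustration and are not taken from the paper.

```python
# Assumed word lists; the lists used by Das and Chen may differ.
NEGATIONS = {"no", "not", "never"}
CLAUSE_PUNCT = {".", ",", ";", ":", "!", "?"}


def mark_negation(tokens, suffix="_neg"):
    """Append `suffix` to every token that appears between a negation
    word and the next clause-level punctuation mark."""
    marked = []
    in_scope = False
    for tok in tokens:
        if tok in CLAUSE_PUNCT:
            # Clause-level punctuation closes any open negation scope.
            in_scope = False
            marked.append(tok)
        elif tok.lower() in NEGATIONS:
            # The negation word itself is left unmarked.
            in_scope = True
            marked.append(tok)
        elif in_scope:
            marked.append(tok + suffix)
        else:
            marked.append(tok)
    return marked


# Made-up, pre-tokenized example sentence.
tokens = "I do not like this phone , but the screen is great .".split()
print(" ".join(mark_negation(tokens)))
# I do not like_neg this_neg phone_neg , but the screen is great .
```

This version depends only on word order and punctuation, which is why I would like to replace it with something based on the dependency parse.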

import spacy
from spacy import displacy

nlp = spacy.load('en_core_web_sm')
doc = nlp(u'$AAPL is óóóóópen to ‘Talk’ about patents with GOOG definitely not the treatment #samsung got:-) heh')

options = {'compact': True, 'color': 'black', 'font': 'Arial'}
displacy.serve(doc, style='dep', options=options)

Visualized dependency paths:

[Image: displaCy rendering of the dependency parse]

Conveniently, the dependency label scheme includes a negation modifier: neg.

To identify negations, I use the following:

negation = [tok for tok in doc if tok.dep_ == 'neg']

Now I want to retrieve the scope of the negations.

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp(u'AAPL is óóóóópen to Talk about patents with GOOG definitely not the treatment got')

print('DEPENDENCY RELATIONS')
print('Key: ')
print('TEXT, DEP, HEAD_TEXT, HEAD_POS, CHILDREN')

for token in doc:
    print(token.text, token.dep_, token.head.text, token.head.pos_,
      [child for child in token.children])

This gives the following output:

DEPENDENCY RELATIONS
Key: 
TEXT, DEP, HEAD_TEXT, HEAD_POS, CHILDREN
AAPL nsubj is VERB []
is ROOT is VERB [AAPL, óóóóópen, got]
óóóóópen acomp is VERB [to]
to prep óóóóópen ADJ [Talk]
Talk pobj to ADP [about, definitely]
about prep Talk NOUN [patents]
patents pobj about ADP [with]
with prep patents NOUN [GOOG]
GOOG pobj with ADP []
definitely advmod Talk NOUN []
not neg got VERB []
the det treatment NOUN []
treatment nsubj got VERB [the]
got conj is VERB [not, treatment]

How can I extract only the head of not (its token.head.text, here got) and its location? Can someone help me out?

Upvotes: 10

Views: 9210

Answers (1)

Sofie VL

Reputation: 3106

You can simply define and loop through the head tokens of the negation tokens you found:

negation_tokens = [tok for tok in doc if tok.dep_ == 'neg']
negation_head_tokens = [token.head for token in negation_tokens]

for token in negation_head_tokens:
    print(token.text, token.dep_, token.head.text, token.head.pos_, [child for child in token.children])

This prints the information for got.
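If you also want the scope, one possible approach is to treat the subtree of the negation's head as the scope. This is only a sketch: equating scope with the head's subtree is an assumption that will not hold for every sentence, and the helper names `negation_scopes` and `mark_negated` as well as the `_neg` suffix are made up for illustration. The code relies only on the token attributes `dep_`, `head`, `subtree`, `i` and `text`.

```python
def negation_scopes(doc):
    """Yield (negation, head, scope) for every token labelled 'neg'.

    `scope` is the head's subtree without the negation token itself.
    Using the subtree as the scope is an assumption, not a guarantee.
    """
    for tok in doc:
        if tok.dep_ == "neg":
            head = tok.head
            # subtree includes the head itself, in sentence order
            scope = [t for t in head.subtree if t.i != tok.i]
            yield tok, head, scope


def mark_negated(doc, suffix="_neg"):
    """Return the token texts, with `suffix` appended to every token
    that falls inside at least one negation scope."""
    negated = set()
    for _, _, scope in negation_scopes(doc):
        negated.update(t.i for t in scope)
    return [t.text + suffix if t.i in negated else t.text for t in doc]
```

For each result, head.i is the position of the head token in the doc and head.idx is its character offset in the original text, which covers the location part of the question. For the parse printed in the question, the subtree of got consists of not, the, treatment and got, so the scope would be the, treatment, got. Other models or spaCy versions may parse the sentence differently.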

Upvotes: 9
