NLTK Tree Format is not as docs show it

Question

In the NLTK docs, this is how a tree (in this case, 'entities') is shown when printed:

import nltk
sentence = """At eight o'clock on Thursday morning
Arthur didn't feel very good."""
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)
entities = nltk.chunk.ne_chunk(tagged)

entities
Tree('S', [('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'),
           ('on', 'IN'), ('Thursday', 'NNP'), ('morning', 'NN'),
       Tree('PERSON', [('Arthur', 'NNP')]),
           ('did', 'VBD'), ("n't", 'RB'), ('feel', 'VB'),
           ('very', 'RB'), ('good', 'JJ'), ('.', '.')])

But when I try to do the exact same thing with the exact same code, this is what happens:

entities
(S
  At/IN
  eight/CD
  o'clock/NN
  on/IN
  Thursday/NNP
  morning/NN
  (PERSON Arthur/NNP)
  did/VBD
  n't/RB
  feel/VB
  very/RB
  good/JJ
  ./.)

In case you haven't caught on, I would like the output of my code (which is the exact same code) to be formatted like the output of the code from the docs.

I have tried this on both Python 2.7 and Python 3.5, with the same results. Is there a fix? Perhaps I'm just missing some NLTK add-on? If there is a fix, I'd prefer Python 2.7.

alexis · Accepted Answer

Are you sure you just typed entities to get the result you report? What you see on the NLTK homepage is the unambiguous representation of the tree object (its “repr” form, in Python terms). You get it when you dump a variable by just typing its name at the prompt. If you print the tree with print(entities) (which is probably what you actually did), you get the customized, more readable form, without the Tree types and tuple notation.

So there is no problem, and no fix is needed. These are two representations of the same object. If you have to use print to see the variable's content (e.g., you are not at the interactive prompt) but you want to match the output you see in the example, you can use print(repr(entities)).
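To make the distinction concrete without needing NLTK installed, here is a minimal sketch using a made-up stand-in class. MiniTree is hypothetical and only imitates the two output styles from the question; it is not NLTK's actual implementation. The point is just that one object can define both a repr form (echoed at the prompt) and a str form (used by print).

```python
class MiniTree(list):
    """Hypothetical stand-in for nltk.Tree, only to illustrate repr vs. str."""

    def __init__(self, label, children):
        list.__init__(self, children)  # works on Python 2.7 and 3.x
        self.label = label

    def __repr__(self):
        # Unambiguous form: what the prompt echoes when you type a bare name.
        children = ", ".join(repr(child) for child in self)
        return "Tree({!r}, [{}])".format(self.label, children)

    def __str__(self):
        # Readable form: what print() displays.
        parts = []
        for child in self:
            if isinstance(child, MiniTree):
                parts.append(str(child))
            else:
                parts.append("{}/{}".format(*child))
        return "({} {})".format(self.label, " ".join(parts))


entities = MiniTree("S", [("At", "IN"), MiniTree("PERSON", [("Arthur", "NNP")])])

repr_form = repr(entities)  # same text as typing `entities` at the prompt
str_form = str(entities)    # same text as print(entities)

print(repr_form)  # Tree('S', [('At', 'IN'), Tree('PERSON', [('Arthur', 'NNP')])])
print(str_form)   # (S At/IN (PERSON Arthur/NNP))
```

With a real NLTK tree the same two calls apply: repr(entities) for the docs-style output, str(entities) or print(entities) for the bracketed one.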
