Reputation: 342
I'm trying to use the NLTK POS-tagger, but am getting a "zipfile.BadZipfile: File is not a zip file" error.
The error comes from this code:
import nltk
sentence = "I love python"
tokens = nltk.word_tokenize(sentence)
pos_tags = nltk.pos_tag(tokens)
print nltk.ne_chunk(pos_tags, binary=True)
I found this question related to my problem. Unfortunately, I can't download the entire corpus collection, since I'm working on a server with tight memory restrictions. Can someone point me to the particular file I need, so I can download just that one instead of all the corpora?
(I'm using Python 2.7.6)
Upvotes: 1
Views: 3222
Reputation: 50220
Try these:
nltk.download("maxent_treebank_pos_tagger")
nltk.download("maxent_ne_chunker")
nltk.download("punkt")
The first two are for POS tagging and named entities, respectively. The third you're not using in your code sample, but you'll need it for nltk.sent_tokenize(), which breaks up plain text into sentences. Since you'll be working with POS tags I'd also download these (they're tiny):
nltk.download(["tagsets", "universal_tagset"])
If you do have a bit of space, downloading the entire "book" collection will give you everything you need to explore the NLTK.
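If the BadZipfile error persists after downloading, one possible cause (an assumption; nothing in the question confirms it) is a truncated or otherwise corrupt archive left behind by an interrupted download. The sketch below uses only the standard library to list such files. The helper name find_bad_zips and the ~/nltk_data location are illustrative; your data directory may be elsewhere (nltk.data.path lists the directories NLTK searches).

```python
import os
import zipfile


def find_bad_zips(root):
    """Return a sorted list of .zip files under root that are not valid zip archives."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".zip"):
                path = os.path.join(dirpath, name)
                # is_zipfile() returns False when the file does not look
                # like a zip archive, e.g. after an interrupted download.
                if not zipfile.is_zipfile(path):
                    bad.append(path)
    return sorted(bad)


if __name__ == "__main__":
    # Assumed location of the NLTK data directory; adjust as needed.
    data_dir = os.path.expanduser("~/nltk_data")
    for path in find_bad_zips(data_dir):
        print(path)
```

Any file it reports could be deleted and the corresponding package fetched again with nltk.download(), which should only cost the space of that one package.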
Upvotes: 2