jjjack1

Reputation: 11

Unexpected EOF, using slate to parse PDF file on Python 2.7.12

I picked up O'Reilly's Data Wrangling with Python by Jacqueline Kazil and Katharine Jarmul. In ch. 5, p. 94, I'm running the following code:

import slate

pdf = 'EN-FINAL Table 9.pdf'

with open(pdf) as f:
    doc = slate.PDF(f)

for page in doc[:2]:
    print page

I'm on Windows 10 with Python 2.7.12, slate 0.5.2 and pdfminer 20140328, all installed successfully with pip. I got the following result:

File "C:\Python27\lib\site-packages\pdfminer\psparser.py", line 215, in fillbuf
    raise PSEOF('Unexpected EOF')
pdfminer.psparser.PSEOF: Unexpected EOF

I only know that EOF means 'end of file', i.e. no more data can be read from the data source. Does anybody have an idea what happened?

If anybody would like to see what file I'm trying to parse, it's right here: https://github.com/jackiekazil/data-wrangling/tree/master/data/chp5

Upvotes: 1

Views: 5180

Answers (1)

neonB

Reputation: 11

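This solved it for me: https://stackoverflow.com/a/18262661/6843645

The fix is to open the PDF in binary mode ('rb'). On Windows, Python 2's default text mode translates line endings and treats a Ctrl-Z byte (0x1A) as end of file, so a binary file such as a PDF is likely to reach pdfminer truncated or altered, which would explain the "Unexpected EOF".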

Your code would be:

import slate

pdf = 'EN-FINAL Table 9.pdf'
with open(pdf, 'rb') as f:
    doc = slate.PDF(f)

for page in doc[:2]:
    print page
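
If you want to check that binary mode really hands back the file untouched, here is a minimal sketch. It does not use slate or pdfminer at all. The helper names read_pdf_bytes and looks_like_complete_pdf and the sample bytes are made up for illustration, and the "%PDF-" / "%%EOF" check is only a rough sanity test, not a real PDF validator.

```python
import os
import tempfile


def read_pdf_bytes(path):
    """Return the raw bytes of a file, opened in binary mode ('rb')."""
    with open(path, 'rb') as f:
        return f.read()


def looks_like_complete_pdf(data):
    """Rough check (an assumption, not a validator): a '%PDF-' header
    at the start and a '%%EOF' marker within the last 1024 bytes."""
    return data.startswith(b'%PDF-') and b'%%EOF' in data[-1024:]


# Hypothetical stand-in for a real PDF. It contains CRLF pairs and a
# 0x1A byte, the kind of content that text mode on Windows can mangle.
sample = b'%PDF-1.4\r\nbinary\x1adata\r\n%%EOF\r\n'

fd, tmp_path = tempfile.mkstemp(suffix='.pdf')
with os.fdopen(fd, 'wb') as f:
    f.write(sample)

data = read_pdf_bytes(tmp_path)
os.remove(tmp_path)
```

Here data comes back byte-for-byte identical to sample. Running looks_like_complete_pdf on the bytes of your own file is a quick way to tell a truncated read apart from a genuinely damaged download.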

Upvotes: 1
