Can pdfplumber extract tables from my scanned PDFs?

Question

(I know that pdfplumber is mainly geared towards computer-generated PDFs. However, before I spend a couple of days hand-typing data from my scanned PDFs, I thought I'd ask whether pdfplumber could somehow help me.)

My problem:
I have scanned PDFs from historical books.
Example: Data from statistical yearbook
Now I'm trying to extract the table (the one in the lower-right in the example) from the scanned PDF.

My first attempts at extracting the table with pdfplumber didn't work, e.g.

import pdfplumber

with pdfplumber.open('test.pdf') as pdf:
    page = pdf.pages[0]  # first page of the scanned PDF
    tables = page.extract_tables()
    print(tables)

This returned None.
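
A rough way to check whether the page is just an image with no text layer, which would explain the empty result, is sketched below. It assumes pdfplumber's `page.chars` and `page.images` properties; `looks_scanned` and `report` are names made up for this sketch, and the heuristic is only a guess at what counts as "scanned".

```python
def looks_scanned(page):
    """Heuristic: the page has no character objects but at least one image.

    `page` is expected to expose `chars` and `images` lists, as a
    pdfplumber page does.
    """
    return len(page.chars) == 0 and len(page.images) > 0


def report(path):
    """Print, for each page of the PDF at `path`, whether it looks scanned."""
    # Imported here so that looks_scanned() can be used without pdfplumber.
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            if looks_scanned(page):
                print(f"page {number}: image only, no text layer")
            else:
                print(f"page {number}: has text objects or no images")


if __name__ == "__main__":
    import os

    # 'test.pdf' is the file name from the snippet above.
    if os.path.exists("test.pdf"):
        report("test.pdf")
```

If every page is reported as image only, pdfplumber has no text or line objects to build a table from, so an OCR step would probably be needed before any table extraction.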

Is there any hope that I will be able to extract this kind of data non-manually? Or should I just suck it up?

Thanks in advance for any help or advice!
