sanjay

Reputation: 1

Reading text (part of title block) from PDF generated from AutoCAD

I have a PDF file generated by AutoCAD, and I need to extract text from it using pdfminer.six. However, I am unable to extract the text that is part of a table (I believe this is the drawing's title block). Could someone advise how to extract such text? I have attached a screenshot of the portion of the PDF that needs to be extracted.

I tried using pdfplumber too, but no luck.

Upvotes: -2

Views: 47

Answers (1)

Mikhail Fraynt

Reputation: 1

If it is a searchable PDF file, I would suggest using Camelot. If it is not searchable, or it is a scan, I would use Tesseract OCR to extract the text from the table. Let me know if this helps or if you need more information about either Tesseract or Camelot.
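
Here is a rough sketch of that decision in Python, in case it helps. It assumes pdfminer.six, camelot-py, pdf2image and pytesseract are installed (the last two also need the Poppler and Tesseract binaries). The function names, the 20-character threshold and the 300 dpi setting are my own placeholders, not anything from your file. The `lattice` flavor is a guess based on title blocks usually having ruled cell borders.

```python
def is_searchable(text, min_chars=20):
    """Heuristic: the page counts as searchable if its text layer
    holds at least min_chars non-whitespace characters.
    min_chars=20 is an arbitrary placeholder threshold."""
    return len("".join(text.split())) >= min_chars


def extract_title_block(pdf_path, page="1"):
    """Return table DataFrames (Camelot) for a searchable page,
    or OCR'd strings (Tesseract) when no usable text layer exists."""
    # Third-party imports stay local so is_searchable works without them.
    from pdfminer.high_level import extract_text

    page_number = int(page)
    # pdfminer.six page numbers are zero-based; Camelot's are one-based.
    text = extract_text(pdf_path, page_numbers=[page_number - 1])

    if is_searchable(text):
        import camelot

        # "lattice" looks for tables drawn with ruling lines.
        tables = camelot.read_pdf(pdf_path, pages=page, flavor="lattice")
        return [table.df for table in tables]

    # No usable text layer: rasterize the page and run OCR on the image.
    from pdf2image import convert_from_path
    import pytesseract

    images = convert_from_path(
        pdf_path, dpi=300, first_page=page_number, last_page=page_number
    )
    return [pytesseract.image_to_string(image) for image in images]
```

Calling `extract_title_block("drawing.pdf")` (the file name is just an example) gives you either a list of pandas DataFrames or a list of OCR strings, so check which branch ran before processing the result. If pdfminer.six returns almost nothing for a page that clearly shows text, that suggests the text was drawn as graphics rather than stored as text, in which case only the OCR branch will help.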

Upvotes: -1
