What's the best way to extract text from a PDF in Python without changing the layout and format?

Question

I want to extract text from a PDF with its exact format and layout preserved.
If direct PDF-to-text conversion is not the right choice, is it possible to go PDF -> XML -> text?
I have already tried PyPDF2, pdfminer, and pdftotext. I have even tried AWS Textract, and it produced an incorrect layout.
Basically, if I can reconstruct sentences from the text extracted from the PDF, that's enough.
I used the Zamzar API, which gives exact output, but it is quite expensive. Is there any other possible solution?
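
To make the PDF -> XML -> text idea concrete, here is a rough sketch of the last step only: turning positioned words into layout-preserving plain text. It assumes the intermediate format already yields each word with its x/y position in points measured from the top-left of the page. The function name `render_layout` and the `char_width` and `line_height` defaults are hypothetical choices for illustration, not part of any library, and would need tuning per document.

```python
from collections import defaultdict


def render_layout(words, char_width=6.0, line_height=12.0):
    """Place (x, y, text) words on a character grid to approximate page layout.

    x and y are assumed to be in points from the top-left corner of the page.
    char_width and line_height are assumed average glyph metrics.
    """
    # Group words into text rows by their vertical position.
    rows = defaultdict(list)
    for x, y, text in words:
        rows[round(y / line_height)].append((x, text))
    if not rows:
        return ""

    lines = []
    # Walk every row index so vertical gaps become blank lines.
    for row in range(min(rows), max(rows) + 1):
        line = ""
        for x, text in sorted(rows.get(row, [])):
            col = int(x / char_width)
            if line and col <= len(line):
                # Keep at least one space between adjacent words.
                col = len(line) + 1
            line = line.ljust(col) + text
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [(0, 0, "Name"), (60, 0, "Qty"), (0, 12, "Apple"), (60, 12, "3")]
    print(render_layout(sample))
```

With a fixed-width grid like this, columns stay aligned as long as the assumed character width is close to the real average glyph width; proportional fonts will drift, so this is only an approximation of the original layout.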
