Dhanendra

Reputation: 113

Load a Python requests response into tabula.read_pdf

I have a URL that returns a PDF as its response. I want to download the PDF with Python's requests module and pass that response to the tabula module's read_pdf function in order to extract the tables from the PDF. I want to do this in memory, without saving anything to disk, but read_pdf takes a parameter input_path that must be a str, a path object, or a file-like object. Can anyone suggest a way to convert the response object into a file-like object?

PS:

  1. I've already tried the io module's BytesIO and StringIO, but neither worked.
  2. The tabula docs mention that input_path can also be a URL to a PDF file, but I need to send additional request headers and use proxies, which is easy with the requests module. If there is a way to pass these arguments to read_pdf directly, that would also do.
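
For reference, the approach being asked about could be sketched roughly as below. This is only a minimal sketch under a few assumptions: the helper names pdf_bytes_to_buffer and read_tables_from_url are made up for illustration, the 30 second timeout is arbitrary, and tabula-py plus a Java runtime are assumed to be installed. It wraps the raw bytes from response.content (not response.text, which is a decoded str) in io.BytesIO, since read_pdf is documented to accept a file-like object. As far as I know, tabula-py may still copy a file-like input to a temporary file before calling its Java backend, so this avoids managing a file yourself but may not be strictly disk-free.

```python
import io


def pdf_bytes_to_buffer(content: bytes) -> io.BytesIO:
    """Wrap raw PDF bytes in a seekable binary file-like object.

    Hypothetical helper: the header check only guards against passing
    an HTML error page or decoded text by mistake.
    """
    if b"%PDF" not in content[:1024]:
        raise ValueError("response body does not look like a PDF")
    return io.BytesIO(content)


def read_tables_from_url(url, headers=None, proxies=None, **tabula_kwargs):
    """Download a PDF with requests and hand it to tabula.read_pdf.

    Hypothetical helper: third-party imports are local so the byte
    handling above can be used without them installed.
    """
    import requests
    import tabula

    # Custom headers and proxies go through requests as usual.
    response = requests.get(url, headers=headers, proxies=proxies, timeout=30)
    response.raise_for_status()

    # response.content is bytes; StringIO or response.text would not work here.
    buffer = pdf_bytes_to_buffer(response.content)
    return tabula.read_pdf(buffer, **tabula_kwargs)
```

A call might look like read_tables_from_url(url, headers={"Authorization": "..."}, proxies={"https": "..."}, pages="all"), where the header and proxy values are placeholders.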

Upvotes: 0

Views: 413

Answers (0)
