How to read PDF files from Google Drive using tabula

Hi, I'm currently trying to build some automation that converts PDF files to CSV and then loads them into BigQuery. The only issue I'm having is reading and converting the PDF file from a Google Drive folder.

I'm currently using tabula via Colab.

--------------------
My code (note that I've already installed everything...):
import tabula

# Read pdf into list of DataFrame
df = tabula.read_pdf('/content/drive/My\ Drive/GDriveToGCS-Folder/TestPDFFileConversion.pdf', pages=2)
--------------------
Error Message: 

FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/My\\ Drive/GDriveToGCS-Folder/TestPDFFileConversion.pdf'
--------------------

Has anyone tried this?

Upvotes: 0

Views: 1152

Answers (1)

korakot

Reputation: 40818

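The problem is the backslash in

My\ Drive

Replace it with just

My Drive

Escaping a space with a backslash is shell syntax. The path is already inside a Python string literal, where the space needs no escaping and the backslash is kept as a literal character (which is why the error message shows My\\ Drive): '/content/drive/My Drive/...'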
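For illustration, here is a minimal sketch of the corrected call. The path is the one from the question with the backslash removed; the helper name read_pdf_tables and its early existence check are hypothetical additions, not part of tabula. It assumes Drive has already been mounted in Colab (for example with drive.mount('/content/drive') from google.colab).

```python
from pathlib import Path

# Path from the question, with the shell-style backslash removed.
PDF_PATH = Path("/content/drive/My Drive/GDriveToGCS-Folder/TestPDFFileConversion.pdf")


def read_pdf_tables(pdf_path, pages=2):
    """Hypothetical helper: verify the file exists, then pass it to tabula."""
    pdf_path = Path(pdf_path)
    if not pdf_path.is_file():
        # Fail early with a hint about the two likely causes.
        raise FileNotFoundError(
            f"{pdf_path} not found; is Drive mounted and the path free of backslashes?"
        )
    # Imported here so the path check above works even where tabula is not installed.
    import tabula
    # Read the tables on the requested page(s) into a list of DataFrames.
    return tabula.read_pdf(str(pdf_path), pages=pages)
```

In a Colab cell, after mounting Drive, this would be called as tables = read_pdf_tables(PDF_PATH). Checking the path first is only a convenience: it separates "the file is not where I think it is" from problems inside tabula itself.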

Upvotes: 1
