goalguy

Reputation: 157

Get text from PDF in Google

I have a PDF document that is saved in Google Drive. I can use the Google Drive Web UI search to find text in the document.

How can I programmatically extract a portion of the text in the document using Google Apps Script?

Upvotes: 5

Views: 13826

Answers (1)

Mogsdad

Reputation: 45750

See pdfToText() in this gist.

To invoke the OCR built into Google Drive on a PDF file, e.g. myPDF.pdf, do the following:

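function myFunction() {
  // getFilesByName() returns an iterator, so check that a match exists first
  var files = DriveApp.getFilesByName("myPDF.pdf");
  if (!files.hasNext()) {
    throw new Error("myPDF.pdf not found in Drive");
  }
  var blob = files.next().getBlob();

  // Get the text from the PDF (pdfToText() is defined in the gist)
  var filetext = pdfToText( blob, {keepTextfile: false} );

  // Now do whatever you want with filetext...
}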
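
Since the question asks for only a portion of the text, here is a minimal sketch of one way to pull a fragment out of filetext once you have it. The helper name extractBetween, the markers, and the sample string are all hypothetical and are not part of the gist; real OCR output may contain extra whitespace or misread characters, so treat this as a starting point.

```javascript
/**
 * Returns the trimmed text found between startMarker and endMarker,
 * or null if either marker is missing.
 * Hypothetical helper for illustration; not part of the gist.
 */
function extractBetween(text, startMarker, endMarker) {
  var start = text.indexOf(startMarker);
  if (start === -1) return null;

  // Begin just after the start marker
  start += startMarker.length;

  // Look for the end marker only after the start marker
  var end = text.indexOf(endMarker, start);
  if (end === -1) return null;

  return text.slice(start, end).trim();
}

// Made-up sample standing in for the OCR result in filetext
var sample = "Invoice No: 12345\nDate: 2016-01-01\nTotal: $10.00";
var invoiceNo = extractBetween(sample, "Invoice No:", "\n");
```

Inside myFunction() you would pass filetext instead of sample. Returning null when a marker is missing lets the caller tell "not found" apart from an empty match.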

Upvotes: 10
