ExperimentsWithCode

Reputation: 1184

Downloading PDFs with WebDriver

I am trying to download PDFs with Selenium WebDriver using the Python bindings on OS X 10.8.

I actually need the PDF file itself, not just a check that the download link works. As I understand it, I need to set the Firefox profile to download the PDF content type rather than preview it, which is the default.

My code to open an instance of Firefox is:

import os
from selenium import webdriver

def Engage():
    print("Start Up FIREFOX")
    ## Create a new instance of the Firefox driver
    profile = webdriver.firefox.firefox_profile.FirefoxProfile()
    ## 2 = save to the custom directory set in browser.download.dir
    profile.set_preference('browser.download.folderList', 2)
    profile.set_preference('browser.download.dir', os.path.expanduser("~/Documents/PYTHON/Download_Files/tmp/"))
    profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'application/pdf')
    driver = webdriver.Firefox(firefox_profile=profile)
    return driver

I have also tried creating the profile like this:

profile = webdriver.FirefoxProfile()
## replacing :: profile = webdriver.firefox.firefox_profile.FirefoxProfile()
## the other attributes remained

This gives the same result.

Either way, the profile opens the PDF in preview mode in a new window rather than downloading it.

I double-checked the content type with requests and confirmed that it is "application/pdf":

import requests
## requests needs an explicit scheme in the URL
print(requests.head('http://mywebsite.com').headers['content-type'])

Any idea what I am doing wrong?

Upvotes: 0

Views: 1285

Answers (2)

drunkel

Reputation: 228

I faced the same problem. My code is in Java, but I'm sure you can translate the preferences to their Python equivalents. Here is what worked for me (note that you need to specify a path to a directory you can write to):

FirefoxProfile profile = new FirefoxProfile();
// 2 = save to the custom directory set in browser.download.dir
profile.setPreference( "browser.download.folderList", 2 );
profile.setPreference( "browser.download.dir", <YOUR DOWNLOAD PATH> );
profile.setPreference( "plugin.disable_full_page_plugin_for_types", "application/pdf" );
// MIME types to save without asking
profile.setPreference(
        "browser.helperApps.neverAsk.saveToDisk",
        "application/csv,text/csv,application/pdf,application/excel" );
profile.setPreference( "browser.download.manager.showWhenStarting", false );
// Disable the built-in PDF viewer
profile.setPreference( "pdfjs.disabled", true );
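
A rough Python equivalent of these preferences might look like the sketch below. The helper names pdf_download_prefs and make_driver are made up for illustration, and the driver is created with the same webdriver.Firefox(firefox_profile=profile) call the question uses, which assumes an older Selenium release that still accepts that argument.

```python
import os


def pdf_download_prefs(download_dir):
    """Return Firefox preferences that save PDFs to download_dir without prompting."""
    return {
        # 2 = use the custom directory given in browser.download.dir
        "browser.download.folderList": 2,
        "browser.download.dir": os.path.expanduser(download_dir),
        "browser.download.manager.showWhenStarting": False,
        # MIME types to save straight to disk (comma-separated list)
        "browser.helperApps.neverAsk.saveToDisk": "application/pdf",
        # Turn off the built-in viewer and any full-page PDF plugin
        "pdfjs.disabled": True,
        "plugin.disable_full_page_plugin_for_types": "application/pdf",
    }


def make_driver(download_dir):
    """Start Firefox with the preferences above (hypothetical helper)."""
    # Imported here so the preference helper can be used without Selenium installed.
    from selenium import webdriver

    profile = webdriver.FirefoxProfile()
    for name, value in pdf_download_prefs(download_dir).items():
        profile.set_preference(name, value)
    # Assumption: a Selenium version that still takes firefox_profile, as in the question.
    return webdriver.Firefox(firefox_profile=profile)
```

Calling make_driver("~/Documents/PYTHON/Download_Files/tmp/") would then stand in for Engage() from the question.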

Upvotes: 0

Vinay

Reputation: 734

I faced a similar situation some time back, and the solution is quite easy. By default, Firefox opens PDF files in the browser rather than letting you download them. To change this, type about:config in the address bar, search for pdfjs.disabled, and double-click the option so that its value changes from false to true. Restart the browser and try opening any PDF file; it will now be downloaded instead of opened in the browser. Happy coding.

Upvotes: 1
