Riley Hun
Riley Hun

Reputation: 2775

Selenium Webdriver: How to Download a PDF File with Python?

I am using selenium webdriver to automate downloading several PDF files. I get the PDF preview window (see below), and now I would like to download the file. How can I accomplish this using Google Chrome as the browser?

Dialog Box

Upvotes: 23

Views: 79759

Answers (6)

Ravi Teja
Ravi Teja

Reputation: 43

You can download the PDF file using Python's requests library

import requests
pdf_url = driver.current_url       # Get Current URL
response = requests.get(pdf_url)
file_name = 'filename.pdf'
with open(file_name, 'wb') as f:
   f.write(response.content)

Upvotes: -2

Kumar
Kumar

Reputation: 451

Try this code, it worked for me.

options = webdriver.ChromeOptions()
options.add_experimental_option('prefs', {
"download.default_directory": "C:/Users/XXXX/Desktop", #Change default directory for downloads
"download.prompt_for_download": False, #To auto download the file
"download.directory_upgrade": True,
"plugins.always_open_pdf_externally": True #It will not show PDF directly in chrome
})
self.driver = webdriver.Chrome(options=options)

Upvotes: 45

Saravana
Saravana

Reputation: 57

I found this piece of code somewhere on Stackoverflow itself and it serves the purpose for me without having to use selenium at all.

import urllib.request

response = urllib.request.urlopen(URL)    
file = open("FILENAME.pdf", 'wb')
file.write(response.read())
file.close()

Upvotes: 4

user16072805
user16072805

Reputation: 59

I did it and it worked, don't ask me how :)

options = webdriver.ChromeOptions()
options.add_experimental_option('prefs', {
#"download.default_directory": "C:/Users/517/Download", #Change default directory for downloads
#"download.prompt_for_download": False, #To auto download the file
#"download.directory_upgrade": True,
"plugins.always_open_pdf_externally": True #It will not show PDF directly in chrome 
})
driver = webdriver.Chrome(options=options)

Upvotes: 5

Umer
Umer

Reputation: 1148

In My case it worked without any code modification,Just need to disabled the Chrome pdf viewer

Here are the steps to disable it

  1. Go into Chrome Settings
  2. Scroll to the bottom click on Advanced
  3. Under Privacy And Security - Click on "Site Settings"
  4. Scroll to PDF Documents
  5. Enable "Download PDF files instead of automatically opening them in Chrome"

Upvotes: -2

Om Prakash
Om Prakash

Reputation: 2881

You can download the pdf (Embeded pdf & Normal pdf) from web using selenium.

from selenium import webdriver

download_dir = "C:\\Users\\omprakashpk\\Documents" # for linux/*nix, download_dir="/usr/Public"
options = webdriver.ChromeOptions()

profile = {"plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}], # Disable Chrome's PDF Viewer
               "download.default_directory": download_dir , "download.extensions_to_open": "applications/pdf"}
options.add_experimental_option("prefs", profile)
driver = webdriver.Chrome('C:\\chromedriver\\chromedriver_2_32.exe', chrome_options=options)  # Optional argument, if not specified will search path.

driver.get(`pdf_url`)

It will download and save the pdf in directory specified. Change the download_dir location and chrome driver location as per your convenience.

You can download chrome driver from here.

Hope it helps!

Upvotes: 3

Related Questions