André Sousa

Reputation: 23

Selenium Python: how can I download a PDF to a specific location without having the URL?

I've been writing Python code using Selenium that should access a webpage and download a PDF. However, when the driver clicks the button, a new tab opens with the PDF, and I can't use that tab's URL to download the file. Can anyone help me, please?

(For example, if I ask the driver to "get" the PDF's URL, it opens the page I was on before, the one with the button that opens Chrome's PDF previewer.)

If the problem is unclear, please let me know so I can try to explain it better.

Upvotes: 2

Views: 559

Answers (1)

Celius Stingher

Reputation: 18367

It seems that Chrome's default configuration opens PDFs in its built-in viewer rather than downloading them. You can change this through the browser options. Here is a working example based on arXiv, which has safe PDF downloads:

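import os

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

download_dir = os.path.join(os.getcwd(), "Downloads")

options = webdriver.ChromeOptions()
options.add_experimental_option("prefs", {
    "download.default_directory": download_dir,  # Directory where downloaded files are saved.
    "download.prompt_for_download": False,  # Download the file without asking for confirmation.
    "download.directory_upgrade": True,
    "plugins.always_open_pdf_externally": True,  # Download PDFs instead of opening them in Chrome's viewer.
})

# Replace with the correct path to your chromedriver executable.
# Selenium 4 expects the path via a Service object rather than as a positional argument.
service = Service(executable_path=os.path.join(download_dir, "chromedriver"))
driver = webdriver.Chrome(service=service, options=options)

driver.get("https://arxiv.org/list/hep-lat/1902")  # Base URL

# Clicks the link that would normally open the PDF; with the prefs above it is downloaded instead.
# Change the XPath to fit your needs.
driver.find_elements(By.XPATH, "/html/body/div[5]/div/dl/dt[1]/span/a[2]")[0].click()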
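
One thing the example above does not cover: the click returns before the download has finished, so a script that quits the driver right away may end up with no file. Below is a minimal sketch of one way to wait for it, assuming Chrome keeps a ".crdownload" suffix on partial files and renames the file once it is complete. The helper name wait_for_new_file and its parameters are my own, not part of Selenium.

```python
import os
import time


def wait_for_new_file(directory, before, suffix=".pdf", timeout=30.0, poll_interval=0.5):
    """Poll `directory` until a file that is not in `before` and ends with `suffix` appears.

    Returns the full path of that file, or None if `timeout` seconds pass first.
    Partial downloads named "<file>.pdf.crdownload" do not end with ".pdf",
    so they are ignored until the browser renames them.
    """
    deadline = time.monotonic() + timeout
    while True:
        new_files = [
            name
            for name in sorted(os.listdir(directory))
            if name not in before and name.endswith(suffix)
        ]
        if new_files:
            return os.path.join(directory, new_files[0])
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

To use it, take a snapshot of the download directory with set(os.listdir(...)) just before the click, pass that set as before, and call the helper right after the click. Comparing against a snapshot means files that were already in the folder (including chromedriver itself in the example above) are not mistaken for the new download.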

Upvotes: 2
