subhanshu kumar

Reputation: 392

How to download a file using Selenium in Python?

I want to download a file using Selenium in Python, but I am not able to do it. I searched for ways to do it but didn't find any relevant resource.

Here is my code:

from selenium import webdriver
driver = webdriver.Chrome('/home/user/Downloads/chromedriver')

# The link below points to a PDF file, not an HTML page. I want to download this file directly.

driver.get("https://authlink-files-storage.ams3.digitaloceanspaces.com/authlink/transfered_certificates_related_docs/supporting_docs_17_2020_07_24_06_25_764ffb965d1b4ae287a0d3cc01c8dd03")

The browser opens the link, but I can't get the file to download.

Upvotes: 2

Views: 981

Answers (3)

Yogesh

Reputation: 1

One approach is to configure Chrome to save downloads to a fixed folder without showing the "Save as" prompt, then verify that the file arrived and delete it once it has been checked. The example below sets the download directory through Chrome preferences, clicks a download button, and defines helpers to verify and delete the file.

    import os

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from webdriver_manager.chrome import ChromeDriverManager

    DOWNLOAD_DIR = "G:/Python/Download/"

    op = webdriver.ChromeOptions()
    op.add_argument("--no-sandbox")
    op.add_argument("--verbose")
    op.add_argument("--disable-notifications")
    op.add_argument("--disable-gpu")
    op.add_argument("--disable-software-rasterizer")
    # Save downloads to DOWNLOAD_DIR without showing the "Save as" prompt.
    op.add_experimental_option("prefs", {
        "download.default_directory": DOWNLOAD_DIR,
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        "safebrowsing.enabled": True,
    })
    # Selenium 4 style; on Selenium 3, pass the driver path as the first
    # positional argument instead of service=...
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=op)

    # Open the page that contains the download button first (driver.get(...)), then click it.
    driver.find_element(By.XPATH, "//span[@type = 'button']").click()

    def download_file_verify(filename):
        """Return True if a file whose name contains `filename` is in DOWNLOAD_DIR."""
        for name in os.listdir(DOWNLOAD_DIR):
            if filename in name and os.path.isfile(os.path.join(DOWNLOAD_DIR, name)):
                print("file downloaded successfully")
                return True
        return False

    def delete_previous_file(filename):
        """Delete every file in DOWNLOAD_DIR whose name contains `filename`."""
        try:
            for name in os.listdir(DOWNLOAD_DIR):
                print("present file is: " + name)
                if filename in name:
                    os.remove(os.path.join(DOWNLOAD_DIR, name))
                    print("Present file is deleted")
        except OSError:
            pass
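
The check above looks at the download folder only once, so it can run before Chrome has finished writing the file. A minimal sketch of a polling helper follows; the name `wait_for_download` and its parameters are hypothetical, and it assumes Chrome's usual behaviour of giving in-progress downloads a `.crdownload` suffix. It also assumes the folder holds no stale files from earlier runs.

```python
import os
import time


def wait_for_download(directory, timeout=30.0, poll_interval=0.5):
    """Return the path of the newest finished file in `directory`.

    A download is treated as finished once at least one regular file exists
    and no ".crdownload" (in-progress) file remains. Raises TimeoutError if
    that does not happen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(directory)
        partial = [n for n in names if n.endswith(".crdownload")]
        finished = [
            os.path.join(directory, n)
            for n in names
            if not n.endswith(".crdownload")
            and os.path.isfile(os.path.join(directory, n))
        ]
        if finished and not partial:
            # Pick the most recently modified file as the new download.
            return max(finished, key=os.path.getmtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"no finished download in {directory!r} after {timeout}s"
            )
        time.sleep(poll_interval)
```

Calling `wait_for_download("G:/Python/Download/")` right after the click would block until the file is complete, after which the verify and delete helpers can run safely.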

Upvotes: 0

Ayaz

Reputation: 279

If a direct download doesn't work, you can work around it using Chrome's print functionality:

  1. Pass the Chrome option --kiosk-printing, which automatically confirms the print dialog as soon as it opens.

    options = webdriver.ChromeOptions()
    options.add_argument("--kiosk-printing")

  2. Define the Chrome preferences as a dictionary.

    prefs = {"savefile.default_directory": "your destination path", "printing.default_destination_selection_rules": {"kind": "local", "idPattern": ".*", "namePattern": "Save as PDF"}}

     The first preference, savefile.default_directory, sets the folder where the PDF is saved. The second selects the "Save as PDF" destination in the print dialog automatically.

  3. Add the preferences as an experimental option.

    options.add_experimental_option("prefs", prefs)

  4. Create the driver with these options.

    driver = webdriver.Chrome(options=options)

  5. Once the PDF is open at the URL, open the print dialog using JavaScript.

    driver.execute_script("window.print()")

The PDF will be saved in the destination path, with the same title as the page.
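
The steps above can be assembled into one script. This is only a sketch: the function names `build_print_prefs` and `print_page_to_pdf` are hypothetical, the preference keys are taken unchanged from this answer, and it assumes a Selenium version that accepts `options=` and can locate chromedriver on its own (or finds it on PATH).

```python
def build_print_prefs(destination_dir):
    """Return the Chrome preferences from the answer for saving prints as PDF."""
    return {
        "savefile.default_directory": destination_dir,
        "printing.default_destination_selection_rules": {
            "kind": "local",
            "idPattern": ".*",
            "namePattern": "Save as PDF",
        },
    }


def print_page_to_pdf(url, destination_dir):
    """Open `url` in Chrome and trigger printing; return the driver.

    The caller should call driver.quit() only after the PDF has appeared in
    `destination_dir`, since quitting too early may interrupt the save.
    """
    # Imported here so build_print_prefs can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--kiosk-printing")  # auto-confirm the print dialog
    options.add_experimental_option("prefs", build_print_prefs(destination_dir))

    driver = webdriver.Chrome(options=options)
    driver.get(url)
    driver.execute_script("window.print()")
    return driver
```

Keeping the preferences in their own function makes it easy to inspect or adjust them without launching a browser.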

Upvotes: 2

Manoj Kumar

Reputation: 41

Try this code, which tells Chrome to download the PDF instead of opening it in the built-in viewer:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

download_dir = "C:\\Temp\\Dowmload"  # for linux/*nix, download_dir="/usr/Public"
options = webdriver.ChromeOptions()

profile = {"plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}],  # Disable Chrome's PDF Viewer
           # Newer Chrome versions ignore plugins_list; this preference makes Chrome download PDFs instead.
           "plugins.always_open_pdf_externally": True,
           "download.default_directory": download_dir, "download.extensions_to_open": "applications/pdf"}
options.add_experimental_option("prefs", profile)
# Selenium 4 style; on Selenium 3, pass the driver path as the first positional argument instead.
driver = webdriver.Chrome(service=Service('//Server/Apps/chrome_driver/chromedriver.exe'), options=options)
driver.get("https://authlink-files-storage.ams3.digitaloceanspaces.com/authlink/transfered_certificates_related_docs/supporting_docs_17_2020_07_24_06_25_764ffb965d1b4ae287a0d3cc01c8dd03")
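
The URL in the question has no file extension, so the saved file may not end in `.pdf`. One way to confirm that the download really is a PDF is to check the file's first bytes. The helpers below are a hypothetical sketch (the names `looks_like_pdf` and `ensure_pdf_extension` are not part of Selenium) and assume the file starts with the standard `%PDF-` header.

```python
import os

PDF_SIGNATURE = b"%PDF-"


def looks_like_pdf(path):
    """Return True if the file at `path` starts with the PDF signature."""
    with open(path, "rb") as fh:
        return fh.read(len(PDF_SIGNATURE)) == PDF_SIGNATURE


def ensure_pdf_extension(path):
    """Rename `path` to end in '.pdf' if it holds PDF data; return the final path.

    Files that already end in '.pdf', or that do not look like PDFs, are left
    untouched and their original path is returned.
    """
    if path.lower().endswith(".pdf") or not looks_like_pdf(path):
        return path
    new_path = path + ".pdf"
    os.replace(path, new_path)
    return new_path
```

After the download finishes, passing the saved file's path to `ensure_pdf_extension` gives it a name that PDF readers will recognise.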

Upvotes: 1
