Omega

Reputation: 871

How to download multiple files with a for loop

I'm stuck on what should be a fairly simple problem, but as a beginner coder it isn't obvious to me. I'm trying to download images from a website and give each file a dynamic name. I think I'm either overwriting the same file over and over again or only downloading the last file (America's favourite sports). It works if I hardcode the file name or limit the download to a single file, but that obviously defeats the purpose. Otherwise I get an error saying: No such file or directory: 'C:\\My File Path\\Images\\John Wick: Chapter 1.jpg'. Can someone point me in the right direction, please?

driver = webdriver.Chrome(executable_path=r'C:\Program Files\chromedriver.exe')
driver.get("https://public.tableau.com/en-gb/gallery/?tab=viz-of-the-day&type=viz-of-the-day")
wait = WebDriverWait(driver, 10)

vizzes = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".gallery-items-list div.gallery-list-item-container")))
for viz in vizzes:

    #name of the viz
    viz_name = viz.find_element_by_class_name("gallery-list-item-title-left").text

    #get image links
    images = viz.find_element_by_xpath(".//img[@data-test-id='galleryListItem-thumbnail-image']")
    image_link = images.get_attribute("src")

    #download images 
    myfile = requests.get(image_link)

    with open("C:\My File Path\Images" + "\\" + viz_name + ".jpg", "wb") as f:
            f.write(myfile.content)

time.sleep(5)

driver.close()

Upvotes: 0

Views: 820

Answers (1)

Matthew Gaiser

Reputation: 4763

Certain characters can't go in file names. The problem is that any character can go in a title.

On Windows, a file name can't contain a colon (:), a question mark (?), or a handful of other reserved characters. Spaces are allowed, but I replace them with underscores anyway to keep the names easy to work with. Your titles contain all of these, so you need a function that converts each title into a name that can safely be used as a file name.

Here is the function I used:

def valid_file_name(name):
    return name.replace(" ", "_").replace("?","").replace(":","")

Here is where I put it:

    with open("C:\\Users\\Matthew\\Pictures\\dumping" + "\\" + valid_file_name(viz_name) + ".jpg", "wb") as f:
        f.write(myfile.content)
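
The function above only handles the three characters that showed up in these particular titles. As a rough sketch of a more general version, the helper below strips every character Windows reserves in file names. The name `safe_file_name` and the `"untitled"` fallback are my own assumptions for illustration, not part of the original code.

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_file_name(title, replacement=""):
    """Hypothetical helper: turn an arbitrary title into a Windows-safe file name."""
    # Drop (or replace) every reserved character.
    name = _INVALID_CHARS.sub(replacement, title)
    # Collapse runs of whitespace into single underscores.
    name = re.sub(r"\s+", "_", name.strip())
    # Windows also rejects names that end in a dot or a space.
    name = name.rstrip(". ")
    # Fall back to a placeholder if nothing usable is left.
    return name or "untitled"


print(safe_file_name("John Wick: Chapter 1"))
```

It could be dropped in wherever `valid_file_name(viz_name)` is called. It does not check for reserved device names such as CON or NUL, which would need separate handling.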

The full and complete code is below and it works for me. Make sure to change the image folder to the one you want to use.

from selenium import webdriver
import requests
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

def valid_file_name(name):
    return name.replace(" ", "_").replace("?","").replace(":","")

driver = webdriver.Chrome()
driver.get("https://public.tableau.com/en-gb/gallery/?tab=viz-of-the-day&type=viz-of-the-day")
wait = WebDriverWait(driver, 15)

vizzes = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".gallery-items-list div.gallery-list-item-container")))
for viz in vizzes:

    #name of the viz
    viz_name = viz.find_element(By.CLASS_NAME, "gallery-list-item-title-left").text

    #get image links
    images = viz.find_element(By.XPATH, ".//img[@data-test-id='galleryListItem-thumbnail-image']")
    image_link = images.get_attribute("src")

    #download images
    myfile = requests.get(image_link)

    print(valid_file_name(viz_name))
    with open("C:\\Users\\Matthew\\Pictures\\dumping" + "\\" + valid_file_name(viz_name) + ".jpg", "wb") as f:
        f.write(myfile.content)

time.sleep(5)

driver.close()
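
Two other things can go wrong when saving in a loop: `open(..., "wb")` raises "No such file or directory" if the target folder does not exist, and two titles that sanitize to the same name will silently overwrite each other. The sketch below is one way to guard against both, assuming Python 3.6+. The helper names `unique_path` and `save_bytes` are hypothetical, not from the code above.

```python
from pathlib import Path


def unique_path(folder, stem, suffix=".jpg"):
    """Hypothetical helper: return a path inside `folder` that is not taken yet."""
    folder = Path(folder)
    # Create the folder (and any missing parents) so the write cannot fail on it.
    folder.mkdir(parents=True, exist_ok=True)
    candidate = folder / f"{stem}{suffix}"
    counter = 1
    # Append _1, _2, ... until the name is free, so nothing is overwritten.
    while candidate.exists():
        candidate = folder / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


def save_bytes(folder, stem, data):
    """Hypothetical helper: write `data` to a fresh file and return its path."""
    path = unique_path(folder, stem)
    path.write_bytes(data)
    return path
```

Inside the loop, the `with open(...)` block could then become `save_bytes(r"C:\Users\Matthew\Pictures\dumping", valid_file_name(viz_name), myfile.content)`. Calling `myfile.raise_for_status()` before saving would also stop a failed download from being written to disk as a broken image.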

Upvotes: 2
