Reputation: 89
This script downloads images, renames them based on line[0] (the first column of each CSV row), appends a number to the end of the file name, and saves them to a folder. My goal is to create the folder name from line[0] of my CSV file. I'm new to Python and need to download and sort 15,000+ images. Any help would be appreciated! I'm using Python 3.8.6.
Note: one model may have many images, so the idea is to create the folder, place the images for that model inside it, then move on to the next model, and so on.
CSV file content:
RHV3-484-L,https://www.fisherpaykel.com/on/demandware.static/-/Sites-fpa-master-catalog/default/dw0c85e188/product-mugs/cooking/ranges/mug/retail/RHV3-484-N-RHV3-484-L-external-mug-rs-84.png
RHV3-484-L,https://www.fisherpaykel.com/on/demandware.static/-/Sites-fpa-master-catalog/default/dwcbd711e5/inspire/caitlin-wilson-portland-dk-339-rs-84.png
RHV3-484-L,https://www.fisherpaykel.com/on/demandware.static/-/Sites-fpa-master-catalog/default/dw3702e52a/inspire/caitlin-wilson-portland-dk-385-rs-84.jpg
RHV3-484-L,https://www.fisherpaykel.com/on/demandware.static/-/Sites-fpa-master-catalog/default/dw0c85e188/product-mugs/cooking/ranges/mug/retail/RHV3-484-N-RHV3-484-L-external-mug-rs-84.png
RHV3-484-L,https://www.fisherpaykel.com/on/demandware.static/-/Sites-fpa-master-catalog/default/dwf99a5a9d/inspire/david-berridge-project-brooklyn-mw-6457-rs-84.jpg
Python script
import sys
import urllib
import urllib.request
from csv import reader
import os.path
import os
csv_filename = "images"
with open(csv_filename+".csv".format(csv_filename), 'r') as csv_file:
    n = 1
    for line in reader(csv_file):
        if not os.path.exists("ImageID"):
            os.makedirs("ImageID")
            print("Image skipped for {0}".format(line[0]))
        else:
            if line[1] != '' and line[0] != "ImageID":
                urllib.request.urlretrieve(line[1], "ImageID/" + line[0] + "-" + str(n) + ".jpg")
                n += 1
                print("Image saved for {0}".format(line[0]))
            else:
                print("No result for {0}".format(line[0]))
Upvotes: 0
Views: 65
Reputation: 11883
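This seems to work as desired.
There are a couple of comments in the middle of the code. Notably, you need to keep each file's original extension (.jpg or .png) instead of always saving as .jpg. The code below takes the last four characters of the URL, so if any of your files have longer extensions (four characters after the dot, such as .jpeg), you may need to split out the file name and then split it on ".".
Good luck!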
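As a rough sketch of that "split out the file name, then split by the dot" idea, the standard library can do both steps. The helper name extension_from_url and the ".jpg" fallback are my own assumptions for illustration, not part of the original script:

```python
import os.path
from urllib.parse import urlsplit


def extension_from_url(url, default=".jpg"):
    """Return the extension of the URL's path, including the leading dot.

    Hypothetical helper: falls back to `default` when the path has no extension.
    """
    path = urlsplit(url).path        # drops any ?query or #fragment part
    ext = os.path.splitext(path)[1]  # ".png", ".jpeg", or "" if there is none
    return ext.lower() if ext else default
```

In the script, something like line[1][-4:] could then be swapped for extension_from_url(line[1]), which would also cope with extensions that are not exactly three letters long.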
import sys
import urllib
import urllib.request
from csv import reader
import os.path
import os
csv_filename = "images.csv"
with open(csv_filename, 'r') as csv_file:
    n = 1  # starting point
    for line in reader(csv_file):
        tgt_folder = line[0]
        if not os.path.exists(tgt_folder):
            os.makedirs(tgt_folder)
            n = 1  # restart n if you find a NEW folder
        # there should be no "else" clause here. Just test the folder name above, but don't waste a file
        if line[1] != '' and line[0] != "ImageID":  # not clear what this is for... ??
            filename = ''.join([line[0], '-', str(n), line[1][-4:]])
            destination = os.path.join(tgt_folder, filename)
            urllib.request.urlretrieve(line[1], destination)
            n += 1
            print("Image saved for {0}".format(line[1]))
        else:
            print("No result for {0}".format(line[1]))
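One caveat with the script above: n only restarts when a folder is first created, so the numbering may carry over if a model's folder already exists from an earlier run or if rows for one model are not grouped together. A possible variation, purely as a sketch, keeps a separate counter per model. The function name plan_destinations is hypothetical, and it only works out the destination paths without downloading anything:

```python
import os
from collections import defaultdict


def plan_destinations(rows):
    """Yield (url, destination_path) for each (model, url) row.

    Hypothetical helper: numbering is tracked per model rather than per folder creation.
    """
    counters = defaultdict(int)  # number of images seen so far for each model
    for model, url in rows:
        if not url or model == "ImageID":  # skip empty URLs and the header row
            continue
        counters[model] += 1
        ext = url[-4:]  # same last-four-characters rule as the script above
        filename = "{0}-{1}{2}".format(model, counters[model], ext)
        yield url, os.path.join(model, filename)
```

Each yielded pair could then be handled with os.makedirs(os.path.dirname(destination), exist_ok=True) followed by urllib.request.urlretrieve(url, destination).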
Upvotes: 1