Reputation: 13
I've been learning Python for a couple of weeks, and I wanted to write a script to automate a very tedious task at work: I basically have to scour through a directory of over 2,000 charities in my city, collect some info, and make an Excel sheet out of it. Here's what I've got so far.
'Book3.xlsx' contains a list of URLs that I scraped with a different script. The script above accesses a link, scrapes the information I want, and saves it in 'Book4.xlsx'. But as you can see, it can only do one at a time.
I've been struggling to automate it. I have tried (with my very limited knowledge) loops, defining a function out of the above script and looping it, etc., but no cigar as of yet.
Any help, tips, etc. would be greatly appreciated. Thank you!
Upvotes: 1
Views: 43
Reputation: 238
Have you considered using the .cell() method? I'm not an openpyxl expert, but this could work using a for loop:
import requests
from openpyxl import load_workbook, Workbook

# You load your sheets
wb = load_workbook("Book3.xlsx")
sheet = wb.active
outputwb = Workbook()
outputsheet = outputwb.active

# You make your 'for' loop as long as you want in order to go over every URL in 'Book3.xlsx'
for i in range(2, 11):
    source = requests.get(sheet.cell(row=i, column=1).value).text
    #
    # You select the info you want from 'source' here
    # (name, mobile, phone, email, website)
    #
    outputsheet.cell(row=i, column=1).value = name
    outputsheet.cell(row=i, column=2).value = mobile + " " + phone
    outputsheet.cell(row=i, column=3).value = email
    outputsheet.cell(row=i, column=4).value = website

outputwb.save("Book4.xlsx")
EDIT: you would have to store all the URLs in the first column of "Book3.xlsx" (column A), and the output for each URL is written to a different row (2, 3, 4, etc.). In this particular example, there are only 9 URLs (rows 2 to 10, including 10).
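Since the question mentions over 2,000 charities, you probably don't want to hardcode the row count either. Here is a minimal sketch of the same loop, assuming the URLs still sit in column A of "Book3.xlsx", that uses openpyxl's max_row so it covers however many rows the sheet actually contains (the parsing step is left as a placeholder, since it depends on the pages being scraped):

import requests
from openpyxl import load_workbook, Workbook

wb = load_workbook("Book3.xlsx")
sheet = wb.active
outputwb = Workbook()
outputsheet = outputwb.active

# max_row is the last row in use, so the loop adapts to the sheet's length
for i in range(2, sheet.max_row + 1):
    url = sheet.cell(row=i, column=1).value
    if not url:
        continue  # skip blank rows instead of calling requests.get(None)
    source = requests.get(url).text
    # ... extract name, mobile, phone, email, website from 'source' ...
    # ... then write them to outputsheet as shown above ...

outputwb.save("Book4.xlsx")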
Upvotes: 1