Scrape URL provided by CSV

Question

I have a CSV with several columns of data; the 15th column holds a URL for each row. I need to take each URL from that column, scrape the current price from the target webpage, and store it in the price column, replacing the old price.

Here is an approximate CSV (the real file has more columns, in different positions):

asin,title,product URL,price
KSKFUSH01,Product Title,http://....,56.00

Below is the sample code I wrote, but it merely prints URLs :(

import csv

with open('some.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)

    for line in csv_reader:
        print(line[15])

Any help or suggestions about accomplishing this goal?

Thanks

user9048861 · Accepted Answer

It looks like you want a csv writer. You already have the URL from each line, so fetch that page, pull the price out of the HTML, and write the row with the new price to a new file:

import csv
from urllib.request import urlopen  # Python 3 replacement for urllib2

from bs4 import BeautifulSoup

with open('some.csv', 'r', newline='') as csv_file, \
        open('newPricedata.csv', 'w', newline='') as new_csv_file:
    csv_reader = csv.reader(csv_file)
    price_writer = csv.writer(new_csv_file)

    # Copy the header row unchanged (assumes the file has one, as in the sample).
    price_writer.writerow(next(csv_reader))

    for line in csv_reader:
        page = urlopen(line[15])  # index of the URL column, as in the question
        soup = BeautifulSoup(page, 'html.parser')
        price = soup.find('td', attrs={'class': 'a-size-mini a-color-price ebooks-price-savings a-text-normal'})
        if price is not None:
            # Assumes price is the last column, as in the sample CSV; adjust the index otherwise.
            line[-1] = price.text.strip()
        price_writer.writerow(line)
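
If the file has a header row, addressing columns by name may be less fragile than counting indexes. Here is a minimal sketch of that idea, assuming the header names from the approximate CSV in the question ('product URL' and 'price'). The names update_prices and fetch_price are hypothetical, the example.com URL is a placeholder, and the lambda at the bottom is only a stub standing in for the urlopen/BeautifulSoup step:

```python
import csv
import io


def update_prices(src, dst, fetch_price,
                  url_field="product URL", price_field="price"):
    """Copy CSV rows from src to dst, replacing price_field with fetch_price(url).

    fetch_price is a hypothetical callable: it takes a URL and returns the new
    price as a string, or None when no price could be scraped.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        new_price = fetch_price(row[url_field])
        if new_price is not None:  # keep the old price if scraping failed
            row[price_field] = new_price
        writer.writerow(row)


# Demo with in-memory files and a stub in place of the real scraper.
sample = (
    "asin,title,product URL,price\n"
    "KSKFUSH01,Product Title,http://example.com/p1,56.00\n"
)
out = io.StringIO()
update_prices(io.StringIO(sample), out, lambda url: "49.99")
result = out.getvalue()
print(result)
```

With real files, pass the two open file objects (both opened with newline='') and a function that fetches the page and returns the price text. Keeping the scraping in its own function also lets you test the CSV handling without hitting the network.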
