Reputation: 95
I'm very new to Python and trying to learn by doing small projects. I'm currently trying to collect some information from several web pages. However, whenever the script writes the scraped data to CSV, the file only contains the data from the last URL.
Ideally, I want it to overwrite the CSV rather than append to it, as I just want a CSV with only the latest data from the most recent scrape.
I've had a look through some other questions similar to this on Stack Overflow, but I'm either not understanding them or they're just not working for me. (Probably the former.)
Any help would be greatly appreciated.
import csv
import requests
from bs4 import BeautifulSoup
import pandas as pd

URL = ['URL1','URL2']

for URL in URL:
    response = requests.get(URL)
    soup = BeautifulSoup(response.content, 'html.parser')

    nameElement = soup.find('p', attrs={'class':'name'}).a
    nameText = nameElement.text.strip()

    priceElement = soup.find('span', attrs={'class':'price'})
    priceText = priceElement.text.strip()

    columns = [['Name','Price'], [nameText, priceText]]

    with open('index.csv', 'w', newline='') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerows(columns)
Upvotes: 0
Views: 32
Reputation: 143197
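You have to open the file before the for loop and write each row inside the loop. Opening the file in 'w' mode truncates it, so opening it again for every URL leaves only the last row.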
import csv
import requests
from bs4 import BeautifulSoup

urls = ['URL1', 'URL2']

# Open the file once, before the loop, so it is not truncated on every iteration
with open('index.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(['Name', 'Price'])  # header row, written once
    for url in urls:
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')
        nameElement = soup.find('p', attrs={'class': 'name'}).a
        nameText = nameElement.text.strip()
        priceElement = soup.find('span', attrs={'class': 'price'})
        priceText = priceElement.text.strip()
        writer.writerow([nameText, priceText])  # one row per URL
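To see why the placement of open() matters, here is a minimal sketch that compares the two patterns without any network access. The function names and the sample rows are made up for illustration; they stand in for the values scraped from each URL.

```python
import csv
import os
import tempfile


def write_inside_loop(path, rows):
    """Reopen the file in 'w' mode for every row (the pattern from the question)."""
    for row in rows:
        # 'w' truncates the file, so each iteration discards the previous one
        with open(path, 'w', newline='') as csv_file:
            writer = csv.writer(csv_file)
            writer.writerows([['Name', 'Price'], row])


def write_outside_loop(path, rows):
    """Open the file once, then write one row per iteration."""
    with open(path, 'w', newline='') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(['Name', 'Price'])
        for row in rows:
            writer.writerow(row)


def read_rows(path):
    """Read the CSV back as a list of rows."""
    with open(path, newline='') as csv_file:
        return list(csv.reader(csv_file))


# Hypothetical rows standing in for scraped values
sample_rows = [['Widget', '9.99'], ['Gadget', '19.99']]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'index.csv')
    write_inside_loop(path, sample_rows)
    inside_result = read_rows(path)
    write_outside_loop(path, sample_rows)
    outside_result = read_rows(path)

print(inside_result)   # header plus only the last row
print(outside_result)  # header plus every row
```

Because the file is still opened in 'w' mode once per run, every run replaces the previous contents, so the CSV only ever holds the latest scrape.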
Or you can create a list before the for loop, append() each row to it inside the loop, and write the whole list to the file after the loop.
import csv
import requests
from bs4 import BeautifulSoup

urls = ['URL1', 'URL2']

# Start with the header row; scraped rows are appended below
columns = [['Name', 'Price']]

for url in urls:
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    nameElement = soup.find('p', attrs={'class': 'name'}).a
    nameText = nameElement.text.strip()
    priceElement = soup.find('span', attrs={'class': 'price'})
    priceText = priceElement.text.strip()
    columns.append([nameText, priceText])

# Write everything in one go, after all URLs have been scraped
with open('index.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerows(columns)
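A variation on the list approach, if you prefer to collect each result as a dictionary: csv.DictWriter from the standard library writes the header from the field names. This is only a sketch. The rows_to_csv helper and the sample data are made up, and it writes to an in-memory buffer instead of index.csv so it can be run without scraping anything.

```python
import csv
import io

FIELDNAMES = ['Name', 'Price']


def rows_to_csv(rows):
    """Return CSV text for a list of dicts keyed by FIELDNAMES."""
    buffer = io.StringIO(newline='')
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()     # header row built from FIELDNAMES
    writer.writerows(rows)   # one CSV row per dictionary
    return buffer.getvalue()


# Hypothetical data standing in for the scraped name and price of each URL
scraped = [
    {'Name': 'Widget', 'Price': '9.99'},
    {'Name': 'Gadget', 'Price': '19.99'},
]

csv_text = rows_to_csv(scraped)
print(csv_text)
```

To write to a real file, the same writer calls should work on a file opened once with open('index.csv', 'w', newline='') in place of the buffer.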
Upvotes: 1