Reputation: 183
Just trying to feed in links from a .csv file, scrape info from each link, and then write it to other columns in the .csv. I've been scratching my head for days. Can anyone else see what's wrong here? The error happens at the soup object below.
import csv
import urllib.request
from bs4 import BeautifulSoup

def scrape_data(csv_file):
    writer = csv.writer(csv_file)
    reader = csv.reader(csv_file)
    for row in reader:
        if row:
            # THE ERROR HAPPENS AT THE SOUP OBJECT BELOW
            soup = BeautifulSoup(urllib.request.urlopen(row[0], 'lxml'))
            post_time = soup.find('time', {'class' : 'date timeago'})
            sqfeet = (sqft.text for sqft in soup.find('span', {'class' : 'shared-line-bubble'}))
            availability = (soup.find('span', {'class' : 'data-date'}))
            attribute_group = (ag.text for ag in soup.find('p', {'class' : 'attrgroup'}))
            address = (add.text for add in soup.find('div', {'class' : 'mapaddress'}))
            for data in zip(post_time, sqfeet, availability, attribute_group, address):
                writer.writerow(row[3])
Upvotes: 0
Views: 79
Reputation: 10959
The 'lxml' argument must be a parameter of BeautifulSoup(), but in your code it is passed to urllib.request.urlopen(). Change the line to:

    soup = BeautifulSoup(urllib.request.urlopen(row[0]), 'lxml')
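To see the difference in isolation, here is a minimal sketch using an in-memory HTML snippet in place of the fetched page (the snippet and its date value are made up for illustration; only the placement of 'lxml' matters):

```python
from bs4 import BeautifulSoup

# Sample markup standing in for the page fetched from row[0]
html = '<time class="date timeago">2016-03-14</time>'

# Correct: 'lxml' is the parser argument of BeautifulSoup(),
# not an argument to urllib.request.urlopen()
soup = BeautifulSoup(html, 'lxml')
print(soup.find('time', {'class': 'date timeago'}).text)
```

With the real code, the fetch stays as urllib.request.urlopen(row[0]) and the response object is passed to BeautifulSoup as its first argument.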
Upvotes: 3