Henk ter Kuile

Reputation: 3

How to make Python loop through a list of URLs and write one row of data per URL to a CSV?

I have a set of URLs (stock data) and want to write certain values from each page to a CSV file. Each row should contain:

name price recrat opinion

A CSV file is created but contains no data, and I get this error:

ValueError: too many values to unpack

How should I go about this? Here is my code so far:

# -*- coding: utf-8 -*-
import urllib2
from bs4 import BeautifulSoup
import csv
from datetime import datetime

quote_page = ['http://uk.mobile.reuters.com/business/quotes/overview/AALB.AS',
   'http://uk.mobile.reuters.com/business/stocks/overview/ABNd.AS',
   'http://uk.mobile.reuters.com/business/stocks/overview/ACCG.AS', 
   'http://uk.mobile.reuters.com/business/stocks/overview/AD.AS']


for link in quote_page:
    try:
        page = urllib2.urlopen(link)
        soup = BeautifulSoup(page, 'html.parser')

        name_box = soup.find('span', attrs={'class': 'company-name'})
        name = name_box.text.strip()
        print name

        price_box = soup.find('span', attrs={'class':'price'})
        price = price_box.text.strip()
        print price

        recrating_box = soup.find('div', attrs={'class':'recommendation-rating'})
        recrat = recrating_box.text.strip()
        print recrat

        opinion = soup.find('div', attrs={'class':'recommendation-marker'})['style']
        print opinion
    except TypeError:
        continue

quote_page.append((name, price, recrat, opinion))   
    # open a csv file with append, so old data will not be erased
with open('index.csv', 'a') as csv_file:
    writer = csv.writer(csv_file)
    for name, price in quote_page:
        writer.writerows([name, price, recrat, opinion, datetime.now()])

Upvotes: 0

Views: 886

Answers (1)

Adders

Reputation: 665

Tested and working:

# -*- coding: utf-8 -*-
import urllib2
from bs4 import BeautifulSoup
import csv
from datetime import datetime

quote_page = ['http://uk.mobile.reuters.com/business/quotes/overview/AALB.AS',
   'http://uk.mobile.reuters.com/business/stocks/overview/ABNd.AS',
   'http://uk.mobile.reuters.com/business/stocks/overview/ACCG.AS', 
   'http://uk.mobile.reuters.com/business/stocks/overview/AD.AS']

results = []

for link in quote_page:
    try:
        page = urllib2.urlopen(link)
        soup = BeautifulSoup(page, 'html.parser')

        name_box = soup.find('span', attrs={'class': 'company-name'})
        name = name_box.text.strip()
        print name

        price_box = soup.find('span', attrs={'class':'price'})
        price = price_box.text.strip()
        print price

        recrating_box = soup.find('div', attrs={'class':'recommendation-rating'})
        recrat = recrating_box.text.strip()
        print recrat

        opinion = soup.find('div', attrs={'class':'recommendation-marker'})['style']
        print opinion
    except (AttributeError, TypeError):
        # find() returns None when an element is missing; skip that page
        continue

    results.append((name, price, recrat, opinion))   

# open the csv file for writing ('w' replaces old data; use 'a' to append instead)
with open('index.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    for item in results:
        writer.writerow([item[0], item[1], item[2], item[3], datetime.now()])

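There were three issues. First, you were appending the scraped values to quote_page, the same list that holds the URLs, which is not a good idea. I collect them in a separate list named results, and the append now sits inside the loop so every page is recorded.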

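Second, your write loop unpacked each entry into only two names (name, price), which raises "ValueError: too many values to unpack" as soon as an entry holds more than two values. I access the four values by index instead.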
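
As a rough illustration of the unpacking problem, here is a small sketch that needs no network access. It is written in Python 3 syntax rather than the Python 2 of the code above, and the URL, the sample values and the helper try_unpack_two are made up for the example.

```python
# Made-up stand-ins for the URL list and the scraped results.
urls = ['http://example.com/quote/ABC']
results = [('Example NV', '12.34', '2.0', 'left: 50%')]


def try_unpack_two(items):
    """Mimic `for name, price in items` and return the error text, if any."""
    try:
        for name, price in items:
            pass
    except ValueError as exc:
        return str(exc)
    return None


# A URL is a string, so unpacking it tries to split it into characters.
url_error = try_unpack_two(urls)

# A 4-tuple unpacked into two names fails in the same way.
tuple_error = try_unpack_two(results)

# Naming all four fields works.
rows = [[name, price, recrat, opinion]
        for name, price, recrat, opinion in results]
```

Both url_error and tuple_error end up holding a "too many values to unpack" message, while the last loop builds the rows without complaint. Unpacking all four names is an alternative to the indexing used in the answer; either works.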

Finally, since you write one row per iteration, writerows needs to be changed to writerow: writerow writes a single row, while writerows expects a list of rows.
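
The difference between the two methods can be sketched like this. Again this is Python 3 syntax with made-up sample rows, and it writes to in-memory buffers instead of index.csv so it can be run anywhere.

```python
import csv
import io

# Made-up sample rows; the real values would come from the scraping loop.
results = [
    ('Example NV', '12.34', '2.0', 'left: 50%'),
    ('Sample BV', '56.78', '3.5', 'left: 80%'),
]

# writerow: one call per row, as in the loop above.
one_by_one = io.StringIO()
writer = csv.writer(one_by_one)
for name, price, recrat, opinion in results:
    writer.writerow([name, price, recrat, opinion])

# writerows: a single call that takes the whole list of rows.
all_at_once = io.StringIO()
csv.writer(all_at_once).writerows(results)

# writerows given ONE flat row treats each string as a row of its own,
# so every character lands in a separate column.
wrong = io.StringIO()
csv.writer(wrong).writerows(['ab', 'cd'])
```

The first two buffers end up with identical contents, so writerows(results) could replace the loop if no per-row timestamp were needed. The third shows why passing a single row to writerows does not give the expected output.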

Upvotes: 1
