I am trying to write a threaded Python script that iterates through a list of URLs and opens each one in a separate thread.
from BeautifulSoup import BeautifulSoup
from threading import Thread
import mechanize

tickers = ["aapl", "siri", "goog", "intc"]
nextTicker = 0

def quotes(i):
    br = mechanize.Browser()
    br.addheaders = [('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10')]
    r = br.open('http://finance.yahoo.com/q?s=' + tickers[nextTicker])
    html = r.read()
    soup = BeautifulSoup(html)
    price = soup.findAll('span', attrs={"id": "yfs_l10_" + tickers[nextTicker]})
    price = price[0].string
    print price

for i in range(4):
    t = Thread(target=quotes, args=(i,))
    t.start()
I know that I need a nextTicker = nextTicker + 1 in there so that each thread grabs a unique ticker symbol from the list named tickers, but I am not sure where to put it or how to ensure that each thread gets a unique URL.
Right now, when the script runs, all four threads grab the item at index 0. How do I get each thread to grab the next item in the list and append it to my base URL?
Upvotes: 0
Views: 3638
Reputation: 4879
Instead of meddling with a nextTicker variable and having to lock it and so forth, just refer to tickers[i]. (Or even better, just pass the ticker itself!)
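To illustrate the tickers[i] suggestion, here is a minimal sketch that leaves out the network call. The seen list is a hypothetical stand-in for the fetch-and-parse step in the question, used only to record which ticker each thread picked; the join() loop is also an addition so the result can be inspected once all threads are done.

```python
from threading import Thread

tickers = ["aapl", "siri", "goog", "intc"]

# Hypothetical bookkeeping (not in the question): one slot per thread,
# recording which ticker that thread ended up using.
seen = [None] * len(tickers)

def quotes(i):
    # i is passed in through args, so it is unique per thread.
    # No shared nextTicker counter and no lock are needed.
    ticker = tickers[i]
    seen[i] = ticker  # stand-in for opening the URL and parsing the price

threads = [Thread(target=quotes, args=(i,)) for i in range(len(tickers))]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait until every thread has finished

print(seen)
```

Each thread writes only to its own slot, so the threads never touch the same data.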
Upvotes: 2
Reputation: 45039
If you want thread specific data, pass it in the arguments.
So use tickers[i] instead of tickers[nextTicker].
Better yet, pass the ticker itself:
for ticker in tickers:
    t = Thread(target=quotes, args=(ticker,))
    t.start()
Possibly better yet, check out eventlet. It allows writing code like this but avoids some of the problems with threads.
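For the plain-threads version where the ticker itself is passed as the argument, a rough sketch might look like the following. It assumes quotes is changed to accept the ticker rather than an index. The urls dictionary and the BASE_URL name are hypothetical, added only so the example runs without network access and shows which URL each thread would open.

```python
from threading import Thread

BASE_URL = 'http://finance.yahoo.com/q?s='  # base URL from the question
tickers = ["aapl", "siri", "goog", "intc"]

# Hypothetical bookkeeping: ticker -> URL built by that ticker's thread.
urls = {}

def quotes(ticker):
    # The thread receives its own ticker directly, so there is no index
    # or shared counter to keep in sync.
    url = BASE_URL + ticker
    urls[ticker] = url  # stand-in for br.open(url) and the parsing step

threads = [Thread(target=quotes, args=(ticker,)) for ticker in tickers]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all threads before reading the results

for ticker in tickers:
    print(urls[ticker])
```

In the real script, the body of quotes would open the URL and look up the span whose id is "yfs_l10_" + ticker, as in the question.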
Upvotes: 3