Reputation: 169
I'm having trouble with the following code. It's supposed to print stock prices by accessing Yahoo Finance, but I can't figure out why it's returning empty strings.
import urllib
import re
symbolslist = ["aapl","spy", "goog","nflx"]
i = 0
while i < len(symbolslist):
    url = "http://finance.yahoo.com/q?s="+symbolslist[i]+"&q1=1"
    htmlfile = urllib.urlopen(url)
    htmltext = htmlfile.read()
    regex = '<span id="yfs_l84_' + symbolslist[i] + '">(.+?)</span>'
    pattern = re.compile(regex)
    price = re.findall(pattern,htmltext)
    print price
    i+=1
Edit: It works fine now; it was a syntax error. I edited the code above as well.
Upvotes: 0
Views: 1941
Reputation: 15058
These are just a few helpful tips for Python development (and scraping):
The Python requests library is excellent at simplifying HTTP requests.
Instead of a while loop, for loops are really useful in this situation.
symbolslist = ["aapl","spy", "goog","nflx"]
for symbol in symbolslist:
    # Do logic here...
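To illustrate the for-loop tip, here is a minimal sketch that builds the question's URLs without a manual counter. `build_url` is a hypothetical helper name, and the URL format is copied from the question rather than checked against the live site.

```python
symbolslist = ["aapl", "spy", "goog", "nflx"]

def build_url(symbol):
    # URL format taken from the question; not verified against the live site.
    return "http://finance.yahoo.com/q?s=" + symbol + "&q1=1"

# Iterating over the list directly removes the i = 0 / i += 1 bookkeeping.
urls = [build_url(symbol) for symbol in symbolslist]

# enumerate() supplies the index in the rare case it is still needed.
for index, symbol in enumerate(symbolslist):
    print("%d %s %s" % (index, symbol, urls[index]))
```

Forgetting `i += 1` in the while version loops forever, which is one reason the for form tends to be safer here.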
import requests
import lxml.html  # "import lxml" alone does not load the html submodule
# symbol comes from the for loop above
url = "http://www.google.co.uk/finance?q="+symbol+"&q1=1"
r = requests.get(url)
xpath = '//your/xpath'
root = lxml.html.fromstring(r.content)
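As a rough offline illustration of selecting an element by path instead of by regex, the sketch below swaps in the standard library's `xml.etree.ElementTree` for lxml so it runs without extra installs. It assumes well-formed markup, which real pages often are not. `SAMPLE_PAGE` is made-up data, `extract_price` is a hypothetical helper, and the span id is taken from the question's regex. With lxml, the comparable step would be calling `root.xpath(...)` on the tree returned by `lxml.html.fromstring`.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup standing in for a downloaded page.
SAMPLE_PAGE = (
    '<html><body>'
    '<span id="yfs_l84_aapl">101.25</span>'
    '<span id="yfs_l84_goog">540.10</span>'
    '</body></html>'
)

def extract_price(page, symbol):
    root = ET.fromstring(page)
    # ElementTree supports a limited XPath subset, including [@attr='value'].
    node = root.find(".//span[@id='yfs_l84_%s']" % symbol)
    # Return None rather than raising when the element is missing.
    return node.text if node is not None else None

print(extract_price(SAMPLE_PAGE, "aapl"))
print(extract_price(SAMPLE_PAGE, "nflx"))
```

Returning `None` for a missing element makes a changed page layout easy to spot, rather than silently printing an empty result.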
Compiling regexes takes time and effort, so you can move the compilation out of your loop.
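import re
# Compile once, before the loop. Capturing the symbol in a group keeps the
# pattern independent of the loop variable, so findall yields (symbol, price) pairs.
regex = r'<span id="yfs_l84_(\w+)">(.+?)</span>'
pattern = re.compile(regex)
for symbol in symbolslist:
    # do logic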
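A small offline sketch of the compile-once idea follows. The `pages` dictionary is made-up markup standing in for downloaded HTML, `find_prices` is a hypothetical helper, and the span id comes from the question's regex. It also shows that `findall` returns an empty list when the expected span is absent.

```python
import re

# Compiled once; group 1 captures the symbol, group 2 the price text.
PRICE_PATTERN = re.compile(r'<span id="yfs_l84_([a-z]+)">(.+?)</span>')

# Made-up markup standing in for downloaded pages.
pages = {
    "aapl": '<div><span id="yfs_l84_aapl">101.25</span></div>',
    "nflx": '<div><span id="other">n/a</span></div>',
}

def find_prices(htmltext):
    # Returns a list of (symbol, price) tuples, or [] when nothing matches.
    return PRICE_PATTERN.findall(htmltext)

for symbol in sorted(pages):
    print("%s %s" % (symbol, find_prices(pages[symbol])))
```

Python's `re` module caches recently compiled patterns, so the speed gain from hoisting is usually small; the clearer benefit is keeping the pattern defined in one place.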
As mentioned in the comment by drewk, both Pandas and Matplotlib have native functions to get Yahoo quotes, or you can use the ystockquote library to scrape from Yahoo. This is used like so:
#!/usr/bin/env python
import ystockquote
symbolslist = ["aapl","spy", "goog","nflx"]
for symbol in symbolslist:
    print (ystockquote.get_price(symbol))
Upvotes: 1