Scraping headlines from Yahoo Finance using Python

I am using Beautiful Soup to extract headlines from this page: http://in.finance.yahoo.com/q?s=AAPL. However, I need the headlines for the past 3 months, i.e. from 10 Dec 2013 to 10 March 2014, and I am able to extract only the headlines that are on this specific page. How can I extract the headlines for that date range for any specific company?

Code:

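import urllib2
from bs4 import BeautifulSoup  # assumes Beautiful Soup 4; the original post does not show its imports

url = 'http://in.finance.yahoo.com/q?s=AAPL'
data = urllib2.urlopen(url)
soup = BeautifulSoup(data)

# The headlines sit in <div id="yfi_headlines"> -> <div class="bd"> -> <ul> -> <li>
divs = soup.find('div',attrs={'id':'yfi_headlines'})
div = divs.find('div',attrs={'class':'bd'})
ul = div.find('ul')
lis = ul.findAll('li')
hls = []
for li in lis:
    # The first child of the link is the headline text
    headlines = li.find('a').contents[0]
    hls.append(headlines)
    print headlines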

Upvotes: 1

Answers (2)

acushner

Reputation: 9946

On http://in.finance.yahoo.com/q?s=AAPL, click 'more headlines from AAPL'. That takes you to a link with a datetime field (the t parameter) in it, for example http://in.finance.yahoo.com/q/h?s=AAPL&t=2014-02-08T15:06:40+05:30. Modify that field to the date you need and you should be good.
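A minimal sketch of how that datetime field could be generated for a date range, written for Python 3. It only builds URLs and does not fetch anything. The URL pattern is copied from the link above. The names headlines_url and urls_between, the fixed +05:30 offset, and the 7-day step are assumptions for illustration; how many headlines each page actually covers is not known here, so the step may need adjusting.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the URL pattern is taken from the example link in this answer.
BASE_URL = 'http://in.finance.yahoo.com/q/h'

# Assumption: the +05:30 offset seen in the example link (Indian Standard Time).
IST = timezone(timedelta(hours=5, minutes=30))


def headlines_url(symbol, when):
    """Return the headlines URL for `symbol` with `when` as the t parameter."""
    # isoformat() on a timezone-aware datetime without microseconds
    # gives the same shape as the example: 2014-02-08T15:06:40+05:30
    stamp = when.replace(microsecond=0).isoformat()
    return '{0}?s={1}&t={2}'.format(BASE_URL, symbol, stamp)


def urls_between(symbol, start, end, step_days=7):
    """Yield one URL per `step_days`, walking backwards from `end` to `start`."""
    current = end
    while current >= start:
        yield headlines_url(symbol, current)
        current -= timedelta(days=step_days)


if __name__ == '__main__':
    start = datetime(2013, 12, 10, tzinfo=IST)
    end = datetime(2014, 3, 10, tzinfo=IST)
    for url in urls_between('AAPL', start, end):
        print(url)
```

Each generated URL can then be passed through the same Beautiful Soup parsing as in the question, keeping only headlines whose dates fall inside the range and dropping duplicates that appear on overlapping pages.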

Upvotes: 0

Leonardo

Reputation: 2504

I think your problem has more to do with where you get your data from. If you need data from the last three months, you should query http://in.finance.yahoo.com/q/hp?s=AAPL instead, where all the data you are looking for is presented in a table.

Upvotes: 0
