cmillion

Reputation: 13

How to grab spot price from yahoo finance using BeautifulSoup

I'm trying to grab the spot price of the SPY ETF: https://finance.yahoo.com/quote/SPY/options

I've mostly tried soup.find_all on the nested 'div' tags:

    from bs4 import BeautifulSoup
    import urllib.request

    url = 'https://finance.yahoo.com/quote/SPY/options/'
    source = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(source,'lxml')

    for div in soup.find_all('div', class_ = "My(6px) smartphone_Mt(15px)"):
        print(div.text)

    for div in soup.find_all('div', class_ = "D(ib) Maw(65%) Ov(h)"):
        print(div.text)

    for div in soup.find_all('div', class_ = "D(ib) Mend(20px)"):
        print(div.text)

Nothing is printed. I also tried the following:

    print(soup.find('span', attrs = {'data-reactid':"35"}).text)

which results in 'Last Price' being printed. Now obviously I want the last price, rather than the words 'last price', but this is closer.

Nested in that span tag is some HTML that includes the number I want. I'm guessing the correct answer has to do with the 'react text: 36' stuff within the span tag (I can't type it here without Stack Overflow thinking I'm trying to actually embed the HTML in this question).

Upvotes: 1

Views: 795

Answers (2)

Orhan Solak

Reputation: 809

I recommend using the requests and scrapy modules:

    import random

    import requests
    from scrapy.selector import Selector

    # Pool of User-Agent strings; one is picked at random for the request.
    ajanlar = [
        'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Mozilla/5.0 (Windows NT 6.4; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)',
    ]
    url = "https://finance.yahoo.com/quote/SPY/options"

    headers = {"User-Agent": random.choice(ajanlar)}
    response = requests.get(url, headers=headers)
    selector = Selector(text=response.text)

    # normalize-space() returns the text of the first matching div
    # with runs of whitespace collapsed.
    xpath1 = "normalize-space(//div[@class='Mt(6px) smartphone_Mt(15px)'])"
    xpath2 = "normalize-space(//div[@class='D(ib) Maw(65%) Maw(70%)--tab768 Ov(h)'])"
    xpath3 = "normalize-space(//div[@class='D(ib) Mend(20px)'])"

    var1 = selector.xpath(xpath1).extract()[0]
    var2 = selector.xpath(xpath2).extract()[0]
    var3 = selector.xpath(xpath3).extract()[0]

    print(var1)
    print(var2)
    print(var3)

Outputs:

    269.97-1.43 (-0.53%)At close: 4:00PM EST269.61 -0.44 (-0.16%)After hours: 6:08PM ESTPeople also watchDIAIWMQQQXLFGLD
    269.97-1.43 (-0.53%)At close: 4:00PM EST269.61 -0.44 (-0.16%)After hours: 6:08PM EST
    269.97-1.43 (-0.53%)At close: 4:00PM EST

After that, you could apply a regex to pull the individual numbers out of these strings.
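A minimal sketch of that regex step, assuming the string always has the shape of the third output line above (price, signed change, signed percentage in parentheses). The function name parse_quote and the pattern are illustrative, not part of the original answer, and an unsigned change such as 0.00 would not match this pattern:

```python
import re

# Hypothetical helper: the pattern assumes "<price><signed change> (<signed pct>%)".
QUOTE_RE = re.compile(
    r"\s*([\d,]+\.\d+)"          # price, e.g. 269.97
    r"\s*([+-][\d,]+\.\d+)"      # signed change, e.g. -1.43
    r"\s*\(([+-]?[\d.]+)%\)"     # percentage change, e.g. (-0.53%)
)

def parse_quote(text):
    """Return (price, change, percent) as floats, or None if the text does not match."""
    match = QUOTE_RE.match(text)
    if match is None:
        return None
    price, change, pct = match.groups()
    return (
        float(price.replace(",", "")),
        float(change.replace(",", "")),
        float(pct),
    )

sample = "269.97-1.43 (-0.53%)At close: 4:00PM EST"
print(parse_quote(sample))  # (269.97, -1.43, -0.53)
```

The sign on the change is what separates it from the price, since the scraped text runs the two numbers together without a space.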

Upvotes: 1

Dan-Dev

Reputation: 9440

If you just want the price:

    import urllib.request
    from bs4 import BeautifulSoup, Comment

    page = urllib.request.urlopen("https://finance.yahoo.com/quote/SPY?p=SPY")
    content = page.read().decode('utf-8')
    soup = BeautifulSoup(content, 'html.parser')

    # Strip the HTML comments (the 'react-text' markers) from the tree.
    comments = soup.find_all(string=lambda text: isinstance(text, Comment))
    for comment in comments:
        comment.extract()

    # The data-reactid and class values match the page markup when this answer was written.
    price = soup.find("span", {"data-reactid": "14", "class": "Trsdu(0.3s) "}).text
    print(price)

Outputs:

    271.40

Upvotes: 1
