Reputation: 31
I am trying to extract real-time stock price data from Yahoo Finance. The price is contained in a "span" tag with a "class" and a "data-reactid" attribute, but I am unable to extract it from that tag.
When I enter my code, I don't get any output nor do I get any errors.
I have tried almost all the other answers to this question, but none have worked for me.
<--HTML Code-->
<span class="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="34">197.00</span>
#Python Script
from urllib.request import urlopen as u_req
from bs4 import BeautifulSoup as soup
my_url = "https://finance.yahoo.com/quote/AAPL?p=AAPL&.tsrc=fin-srch"
u_client = u_req(my_url)
page_html = u_client.read()
u_client.close()
page_soup = soup(page_html, "html.parser")
container = page_soup.find('span', {"data-reactid":'34'})
I would like to get "197.00" (the real-time price of the stock) as the output.
Upvotes: 3
Views: 1667
Reputation: 22440
You can fetch that in a number of ways. Here is one of them:
import requests
from bs4 import BeautifulSoup
res = requests.get('https://finance.yahoo.com/quote/AAPL')
soup = BeautifulSoup(res.text, 'lxml')
price = soup.select_one('#quote-market-notice').find_all_previous()[2].text
print(price)
Another way:
price = soup.select_one("[class*='smartphone_Mt'] span").text
print(price)
Upvotes: 3
Reputation: 295
I opened the URL in Chrome and pressed F12. The Network tab showed this request being made by the page: https://query1.finance.yahoo.com/v8/finance/chart/AAPL?region=US&lang=en-US&includePrePost=false&interval=2m&range=1d&corsDomain=finance.yahoo.com&.tsrc=finance
I would suggest exploring the underlying AJAX calls, which appear to return nicely formatted JSON. The URL also has a number of parameters you can modify.
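As a rough sketch of that approach, the snippet below requests the chart URL for a symbol and pulls a price out of the JSON. The function names fetch_chart and extract_price are hypothetical, and the path chart -> result[0] -> meta -> regularMarketPrice is an assumption about the response layout, so inspect the actual JSON in the Network tab before relying on it. The endpoint is not a documented API and may change or reject requests.

```python
import json
import urllib.parse
import urllib.request

# Base of the request seen in the browser's Network tab.
CHART_URL = "https://query1.finance.yahoo.com/v8/finance/chart/{symbol}"


def fetch_chart(symbol, interval="2m", range_="1d"):
    """Download the chart JSON for a symbol (hypothetical helper)."""
    query = urllib.parse.urlencode({"interval": interval, "range": range_})
    url = CHART_URL.format(symbol=urllib.parse.quote(symbol)) + "?" + query
    # A browser-like User-Agent is an assumption; the default one may be refused.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def extract_price(payload):
    """Return the price from the decoded JSON, or None if it is missing.

    Assumed layout: payload["chart"]["result"][0]["meta"]["regularMarketPrice"].
    """
    results = (payload.get("chart") or {}).get("result") or []
    if not results:
        return None
    return results[0].get("meta", {}).get("regularMarketPrice")
```

With network access, print(extract_price(fetch_chart("AAPL"))) would print the price if the assumed layout holds, and None otherwise.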
Upvotes: 0
Reputation: 84465
Given that data-reactid can change, I would select by a unique class instead. Selecting by class is also faster.
import requests
from bs4 import BeautifulSoup as bs
r = requests.get('https://finance.yahoo.com/quote/AAPL/')
soup = bs(r.content, 'lxml')
print(soup.select_one(r'.Mb\(-4px\)').text)  # raw string keeps the CSS escapes intact
Upvotes: 0
Reputation: 11
For some reason, the data-reactid is 14 rather than 34 when the page is fetched from the URL.
page_soup = soup(page_html, "html.parser")
container = page_soup.find('span', {"data-reactid":'14'})
if container:
    print(container.text)
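To illustrate why the lookup in the question can come back empty, here is a small offline sketch. BROWSER_HTML is the span from the question; FETCHED_HTML is a hypothetical copy with data-reactid set to 14, standing in for what the script receives. The helper names are made up for this example. Matching on data-reactid "34" finds nothing in the fetched copy, while a class selector finds the price in both.

```python
from bs4 import BeautifulSoup

# The span as shown in the browser (from the question).
BROWSER_HTML = (
    '<span class="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" '
    'data-reactid="34">197.00</span>'
)
# Hypothetical version of the same span as received by the script.
FETCHED_HTML = BROWSER_HTML.replace('data-reactid="34"', 'data-reactid="14"')


def price_by_reactid(html, reactid):
    """Look the span up by its data-reactid attribute; None if absent."""
    tag = BeautifulSoup(html, "html.parser").find("span", {"data-reactid": reactid})
    return tag.text if tag else None


def price_by_class(html):
    """Look the span up by one of its classes; parentheses must be escaped."""
    tag = BeautifulSoup(html, "html.parser").select_one(r"span.Mb\(-4px\)")
    return tag.text if tag else None


print(price_by_reactid(BROWSER_HTML, "34"))
print(price_by_reactid(FETCHED_HTML, "34"))
print(price_by_class(FETCHED_HTML))
```

Checking for None before reading .text, as both helpers do, also avoids an AttributeError when the element is missing.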
Upvotes: 1