Reputation: 1049
I am trying to read data from this div tag on a website.
<div class="Bgc($lv2BgColor) Bxz(bb) Ovx(a) Pos(r) Maw($newGridWidth) Miw($minGridWidth) Miw(a)!--tab768 Miw(a)!--tab1024 Mstart(a) Mend(a) Px(20px) Py(10px) D(n)--print">
from bs4 import BeautifulSoup
import requests
import re
from urllib.request import urlopen
url = "https://finance.yahoo.com/"
urlpage=urlopen(url).read()
bswebpage=BeautifulSoup(urlpage)
t = bswebpage.find_all("div",{'class':"Bgc($lv2BgColor) Bxz(bb) Ovx(a) Pos(r) Maw($newGridWidth) Miw($minGridWidth) Miw(a)!--tab768 Miw(a)!--tab1024 Mstart(a) Mend(a) Px(20px) Py(10px) D(n)--print"})
print(t)
I use find_all with BeautifulSoup, but the output is empty. It shows only this:
[]
How can I fix it?
Upvotes: 0
Views: 82
Reputation: 16187
Most likely urlopen isn't working properly here, and the element selection may also be slightly off. In any case, the solution below works.
from bs4 import BeautifulSoup
import requests
url = "https://finance.yahoo.com/"
res = requests.get(url)
#print(res)
bswebpage=BeautifulSoup(res.text,'lxml')
t = [x.get_text(' ',strip=True) for x in bswebpage.select('div[class="Carousel-Mask Pos(r) Ov(h) market-summary M(0) Pos(r) Ov(h) D(ib) Va(t)"] > ul > li h3')]
print(t)
Output:
['S&P 500 4,085.17 -32.69 (-0.79%)', 'Dow 30 33,706.91 -242.10 (-0.71%)', 'Nasdaq 11,799.67 -110.85 (-0.93%)', 'Russell 2000 1,918.40 -24.20 (-1.25%)', 'Crude Oil 77.79 -0.68 (-0.87%)', 'Gold 1,873.10 -17.60 (-0.93%)']
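One caveat with the selector above: matching on the full class attribute string stops working as soon as a single class is added, removed, or reordered. A minimal sketch of a looser match, assuming the `market-summary` token alone is enough to identify the container. The HTML here is a made-up stand-in for the real page, not something fetched from Yahoo, and it uses the built-in `html.parser` so lxml isn't needed:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; only the class
# tokens mirror the ones used in the answer above.
html = """
<div class="Carousel-Mask Pos(r) Ov(h) market-summary M(0)">
  <ul>
    <li><h3><a>S&amp;P 500</a> <span>4,085.17</span></h3></li>
    <li><h3><a>Dow 30</a> <span>33,706.91</span></h3></li>
  </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# div.market-summary matches any div whose class list contains that token,
# regardless of the other classes or their order.
items = [
    h3.get_text(" ", strip=True)
    for h3 in soup.select("div.market-summary > ul > li h3")
]
print(items)
```

Whether the live page still carries that token is something you'd have to confirm in the HTML that requests actually returns.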
Upvotes: 1
Reputation: 694
You could get the parent of that div instead, since it has an id, which is unique by design. Then, since that parent div has just one child, the element you're looking for, it's as simple as getting the parent's child:
t = bswebpage.find("div",{'id': 'Lead-3-FinanceHeader-Proxy'}).div
print(t)
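Note that find() returns None when nothing matches, so chaining .div directly raises an AttributeError if the id isn't in the fetched HTML. A small sketch of the same lookup with a guard, run against made-up markup. Only the id comes from the line above; the inner div and its text are assumptions for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the id is taken from the answer above.
html = (
    '<div id="Lead-3-FinanceHeader-Proxy">'
    '<div class="Px(20px)">header content</div>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

parent = soup.find("div", id="Lead-3-FinanceHeader-Proxy")
# find() returns None when nothing matches, so check before reaching for the child.
child = parent.div if parent is not None else None
text = child.get_text(strip=True) if child is not None else None
print(text)
```

If text comes back as None on the real page, the id is probably missing from the response, which would point to the fetch rather than the selector.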
Upvotes: 0