Reputation: 363
I'm practicing some web scraping and for this project I'm scraping this website: https://assetdash.com/?all=true
I'm getting and parsing the HTML code as following.
my_url = 'https://assetdash.com/?all=true'
client = urlopen(my_url)
page_html = client.read()
client.close()
soup = BeautifulSoup(page_html, 'html.parser')
rows = soup.findAll("tr", {"class":"Table__Tr-sc-1pfmqa-5 gNrtPb"})
print(len(rows))
This returns a length of 0 whereas it should be returning a much higher value. Have I done something wrong with the parsing or am I retrieving the rows incorrectly?
Upvotes: 1
Views: 62
Reputation: 28565
It dynamic and javascript rendered. Go straight to the source of the data.
Code:
import requests
my_url = 'https://assetdash.herokuapp.com/assets?currentPage=1&perPage=200&typesOfAssets[]=Stock&typesOfAssets[]=ETF&typesOfAssets[]=Cryptocurrency'
data = requests.get(my_url).json()
df = pd.DataFrame(data['data'])
Output:
print (df)
id ticker ... peRatio rank
0 60 AAPL ... 35.17 1
1 2287 MSFT ... 34.18 2
2 251 AMZN ... 91.52 3
3 1527 GOOGL ... 33.79 4
4 1276 FB ... 31.09 5
.. ... ... ... ... ...
195 537 BMWYY ... 15.06 196
196 3756 WBK ... 35.57 197
197 1010 DG ... 23.40 198
198 1711 HUM ... 12.77 199
199 1194 EQNR ... -15.82 200
[200 rows x 13 columns]
Upvotes: 1