Reputation: 1
I am trying to scrape a web page to collect a list of Fortune 500 companies. However, when I run this code, BeautifulSoup can't find the <div class="rt-tr-group" role="rowgroup"> tags.
import requests
from bs4 import BeautifulSoup
url = r'https://fortune.com/fortune500/2019/search/'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'lxml')
data = soup.find_all('div', {'class': 'rt-tr-group'})
Instead, I just get an empty list. I've tried changing the parser, but it made no difference.
The tags exist and can be seen here:
Upvotes: 0
Views: 179
Reputation: 51
The content of the page you are parsing is loaded with JavaScript, so requests.get only returns the initial HTML, which does not contain that content yet.
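One way to confirm this is to compare what the parser finds in the HTML the server sends with what it finds once the rows are actually in the markup. This is a minimal sketch: `count_row_groups` and both sample strings are made up for illustration, and the real page's markup will differ.

```python
from bs4 import BeautifulSoup


def count_row_groups(html: str) -> int:
    """Count <div> elements whose class list includes 'rt-tr-group'."""
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.find_all("div", class_="rt-tr-group"))


# Hypothetical stand-in for what the server sends: an empty container
# plus a script that would fill it in later inside a browser.
STATIC_HTML = '<div id="root"></div><script src="app.js"></script>'

# Hypothetical stand-in for the DOM after that script has run.
RENDERED_HTML = (
    '<div id="root">'
    '<div class="rt-tr-group" role="rowgroup">Row A</div>'
    '<div class="rt-tr-group" role="rowgroup">Row B</div>'
    "</div>"
)

print(count_row_groups(STATIC_HTML))
print(count_row_groups(RENDERED_HTML))
```

If `count_row_groups(requests.get(url).text)` also gives zero for the real page while the browser's inspector shows the rows, the rows are most likely being added by JavaScript after the initial load.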
Upvotes: 0
Reputation: 882
The data on that page is loaded by JavaScript some time after the initial HTML arrives. With Selenium you can wait until the page has finished loading and then parse it, or you can try to extract the data from the JavaScript itself.
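A rough sketch of the Selenium route, assuming Chrome and a matching driver are available locally. The function names `fetch_rendered_html` and `row_texts`, the 20-second timeout, and the choice of headless Chrome are my assumptions, not something the page requires; the `div.rt-tr-group` selector comes from the question.

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(url: str, timeout: int = 20) -> str:
    """Open url in headless Chrome and return the DOM once rows exist."""
    # Imported here so the parsing helper below works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until at least one row is in the DOM; raises
        # TimeoutException if none appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "div.rt-tr-group"))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


def row_texts(html: str) -> list:
    """Return the visible text of every rt-tr-group row in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [row.get_text(" ", strip=True) for row in soup.select("div.rt-tr-group")]


# Usage (starts a real browser, so it is left commented out):
# html = fetch_rendered_html("https://fortune.com/fortune500/2019/search/")
# print(row_texts(html))
```

Waiting for a specific element tends to be more reliable than a fixed `time.sleep`, because it returns as soon as the rows appear and fails loudly if they never do.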
P.S. You can check for XHR requests and try to get JSON instead, without parsing. Here is one request
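A sketch of the JSON route, assuming you have copied the request URL from the browser's Network tab (filtered to XHR). The URL is passed in as a parameter here because I don't know the exact endpoint, and `fetch_json` and `find_records` are hypothetical helpers. Since the payload's structure is unknown, `find_records` just looks for the first list of objects anywhere in it; once you have seen the real JSON, direct key access would be clearer.

```python
import requests


def fetch_json(xhr_url: str):
    """GET an XHR endpoint copied from the browser's Network tab."""
    response = requests.get(xhr_url, timeout=30)
    response.raise_for_status()
    return response.json()


def find_records(payload):
    """Return the first non-empty list of dicts found in the payload."""
    if isinstance(payload, list):
        if payload and all(isinstance(item, dict) for item in payload):
            return payload
        children = payload
    elif isinstance(payload, dict):
        children = payload.values()
    else:
        # Strings, numbers, booleans and None cannot contain records.
        return []
    for child in children:
        found = find_records(child)
        if found:
            return found
    return []


# Usage (needs the real URL from the Network tab, so left commented out):
# records = find_records(fetch_json(xhr_url))
# print(len(records))
```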
Upvotes: 1