Reputation: 17
Here is a website I'm trying to scrape: https://opyn.co/#/buy
I am trying to grab a div, but for some reason an empty list is returned. There must be something I am doing wrong. Or perhaps a different approach is needed when dealing with nested divs on websites running frameworks like React?
Here is the code:
from bs4 import BeautifulSoup
import requests
url = 'https://opyn.co/#/buy'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'}
r = requests.get(url, headers=headers)
data = r.text
soup = BeautifulSoup(data, "lxml")
print(soup)
all_p = soup.find_all('div')
print(f"{all_p} | Status code: {r.status_code}")
What are my options? How can I get the content of nested divs that are rendered by React?
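For context, here is a minimal sketch of why `find_all('div')` comes back nearly empty. The HTML shell below is hypothetical, not the actual response from opyn.co: the server for a React single-page app typically returns only an empty mount-point div, and the content is filled in later by JavaScript running in the browser, which `requests` never executes.

```python
from bs4 import BeautifulSoup

# Hypothetical server response for a React single-page app: the HTML
# shell contains only an empty mount point; the visible content is
# rendered client-side by the JS bundle, which requests does not run.
html_shell = """
<html>
  <body>
    <div id="root"></div>
    <script src="/static/js/bundle.js"></script>
  </body>
</html>
"""

soup = BeautifulSoup(html_shell, "html.parser")
divs = soup.find_all("div")
print(len(divs))             # -> 1: only the empty root div is present
print(divs[0].text.strip())  # -> empty string: no rendered text to scrape
```

So the request and the parsing both succeed; there is simply no rendered content in the HTML that the server sends back.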
Upvotes: 0
Views: 69
Reputation: 195
As advised in the comments, you can use Selenium to get the dynamically rendered content:
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument("user-agent=Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36")

# With Selenium 4, pass the driver path via a Service object
# (the executable_path argument was removed from the Chrome constructor).
chrome_driver = webdriver.Chrome(service=Service("chromedriver.exe"), options=chrome_options)
chrome_driver.get("https://opyn.co/#/buy")
time.sleep(2)  # crude wait to make sure the JS call pulled the data you need

# find_elements_by_css_selector was removed in Selenium 4;
# use find_elements with a By locator instead.
divs = chrome_driver.find_elements(By.CSS_SELECTOR, 'div')
divs_content = [div.text for div in divs]
print(divs_content)

chrome_driver.quit()
You will need to download a driver executable that suits your browser: https://pypi.org/project/selenium/. For the example I provided you will need the Chrome webdriver: https://sites.google.com/a/chromium.org/chromedriver/downloads. Check which version of Chrome your PC is running, download the matching version of the executable, and place it in the same folder as your Python application.
Upvotes: 1