Mr. B.

Reputation: 8697

How to scrape an API response that is requested by the target website?

I'd like to scrape content from a website that is requested asynchronously and is not visible in the page source.

How can I wait for the website's own request? I need to sniff its traffic somehow, but I couldn't find anything yet.

I'm looking for something like this (pseudocode):

import requests
from bs4 import BeautifulSoup

page = requests.get("http://target.tld")
traffic = page.sniff_traffic(seconds=10)  # hypothetical: requests has no such method
for req in traffic:
    print(req)  # http://api.target.tld

soup = BeautifulSoup(page.content, "html.parser")

Any ideas?

Upvotes: 0

Views: 140

Answers (1)

Birb

Reputation: 866

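You can't do that with requests and BeautifulSoup alone. They only fetch and parse the static HTML and never run the page's JavaScript, so the asynchronous API request is never made. You need something that drives a real browser, such as Selenium with geckodriver (Firefox).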
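As a rough sketch of that approach, assuming Selenium 4 with Firefox and geckodriver installed: load the page in the browser, wait a bit, then ask the browser's Resource Timing API which URLs the page fetched via XHR or `fetch()`. The names `api_urls` and `sniff_traffic` are made up for illustration (the latter borrowed from the question), and the fixed sleep is a crude stand-in for a proper wait condition.

```python
import time


def api_urls(entries):
    """Return the unique URLs the page requested via XHR or fetch().

    `entries` is a list of dicts holding the `name` (URL) and
    `initiatorType` fields of Resource Timing entries.
    """
    seen = []
    for entry in entries:
        if entry.get("initiatorType") in ("xmlhttprequest", "fetch"):
            url = entry.get("name")
            if url and url not in seen:
                seen.append(url)
    return seen


def sniff_traffic(url, seconds=10):
    """Load `url` in Firefox and list the XHR/fetch URLs it requested."""
    # Imported here so api_urls() can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()  # assumes Firefox and geckodriver are available
    try:
        driver.get(url)
        time.sleep(seconds)  # crude wait for the asynchronous requests to fire
        # Ask the browser which resources the page has loaded so far.
        entries = driver.execute_script(
            "return performance.getEntriesByType('resource')"
            ".map(e => ({name: e.name, initiatorType: e.initiatorType}));"
        )
    finally:
        driver.quit()
    return api_urls(entries)


# Usage (opens a real browser window):
#   for req in sniff_traffic("http://target.tld"):
#       print(req)
```

This only gives you the URLs, not the response bodies, and the browser keeps a limited number of resource entries. Once you know the API endpoint, you can often call it directly with `requests.get(...)` and parse the JSON, without a browser at all.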

Upvotes: 1
