Reputation: 897
Does Python 3 have a library for scraping JavaScript-rendered pages that is not Selenium? I'm trying to scrape https://www.mailinator.com/v2/inbox.jsp?zone=public&query=test, but the inbox is loaded with JavaScript. The reason I don't want to use Selenium is that I don't want it to open a browser window when I run it.
Here is my non-working code:
    import requests
    from bs4 import BeautifulSoup as soup

    INBOX = "https://www.mailinator.com/v2/inbox.jsp?zone=public&query={}"

    def check_inbox(name):
        stuff = soup(requests.get(INBOX.format(name)).text, "html.parser")
        print(stuff.find("ul", {"class": "single_mail-body"}))

    check_inbox("retep")
Do any such libraries exist?
Searching Google for "python 3 javascript scraper" didn't turn up anything outside of Selenium.
Upvotes: 2
Views: 880
Reputation: 11943
You don't actually need a JavaScript engine: the JavaScript runs client-side, so you can replicate the requests it makes yourself.
If you inspect the page (developer tools > Network), you'll see that it opens a websocket connection to:
wss://www.mailinator.com/ws/fetchinbox?zone=public&query=test
If you implement a websocket client in Python, you'll be able to fetch your mails cleanly (see this example: https://github.com/aaugustin/websockets/blob/master/example/client.py).
EDIT:
As mentioned by John, augustin's ws client repo is dead. Today I'd use this: https://websockets.readthedocs.io/en/stable/
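To make that concrete, here is a minimal sketch of such a client using the websockets package (pip install websockets). The endpoint is the one from the network tab above; the helper names inbox_ws_url and read_frames, the frame limit, and the timeout are my own choices for illustration. I haven't verified what the server actually sends, or whether it expects a message from the client first, so the sketch just collects raw frames for you to inspect.

```python
import asyncio
from urllib.parse import urlencode

# Endpoint as observed in the browser's network tab (see above).
WS_ENDPOINT = "wss://www.mailinator.com/ws/fetchinbox"


def inbox_ws_url(query, zone="public"):
    """Build the websocket URL for a given inbox name (hypothetical helper)."""
    return WS_ENDPOINT + "?" + urlencode({"zone": zone, "query": query})


async def read_frames(url, max_frames=5, timeout=10.0):
    """Connect and collect up to max_frames raw frames (hypothetical helper).

    The frame format is an assumption left open here: frames are returned
    unparsed so the actual payload can be inspected first.
    """
    # Imported here so inbox_ws_url stays usable without the dependency.
    import websockets

    frames = []
    async with websockets.connect(url) as ws:
        for _ in range(max_frames):
            try:
                frames.append(await asyncio.wait_for(ws.recv(), timeout))
            except asyncio.TimeoutError:
                break  # nothing arrived within the timeout
            except websockets.ConnectionClosed:
                break  # server closed the connection
    return frames
```

You would run it with something like asyncio.run(read_frames(inbox_ws_url("retep"))) and print the result. If the frames turn out to be JSON, json.loads on each one is the obvious next step. No browser window is opened at any point, which was the original concern with Selenium.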
Upvotes: 1