Reputation: 3431
I am trying to scrape a web page that has an embedded tweet: https://thehill.com/homenews/news/376608-west-virginia-teachers-to-continue-strike-after-state-senate-passes-lower-raise
When I use Inspect Element in my browser, it shows the HTML element corresponding to the embedded tweet, but when I search the page source or use BeautifulSoup's findAll(), neither returns any result. How can I fix this problem?
Upvotes: 0
Views: 181
Reputation: 28565
The embedded tweet is rendered dynamically by JavaScript, so you'll need something like Selenium to render the page before parsing it. However, the link to the tweet, along with part of its text, is in the original HTML source, so you could go after that instead:
import requests
from bs4 import BeautifulSoup

url = 'https://thehill.com/homenews/news/376608-west-virginia-teachers-to-continue-strike-after-state-senate-passes-lower-raise'
# Send a browser-like user agent with the request
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'}

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')

# The static HTML holds each embedded tweet as a <blockquote class="twitter-tweet">
tweets = soup.find_all('blockquote', {'class': 'twitter-tweet'})
for tweet in tweets:
    # Take the href of the first link inside the blockquote
    tweet_link = tweet.find('a')['href']
    print(tweet_link)
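One caveat with taking the first link: in the markup Twitter's embed code usually produces, the first `<a>` inside the blockquote can be a hashtag or media link, and the tweet's own permalink tends to be the last one. Here is a minimal sketch of picking the last link instead. It assumes that usual markup shape, and it uses the standard library's html.parser rather than BeautifulSoup so it runs without extra packages. The names `TweetLinkParser`, `extract_tweet_links` and the sample HTML are made up for illustration and are not taken from the page in question.

```python
from html.parser import HTMLParser


class TweetLinkParser(HTMLParser):
    """Collect the last <a href> inside each <blockquote class="twitter-tweet">."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # blockquote nesting depth inside a tweet embed
        self.current = None  # most recent href seen in the current embed
        self.links = []      # one link per embed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "blockquote":
            classes = (attrs.get("class") or "").split()
            # Enter an embed, or track a nested blockquote inside one
            if self.depth or "twitter-tweet" in classes:
                self.depth += 1
        elif tag == "a" and self.depth and attrs.get("href"):
            # Later links overwrite earlier ones, so the last one wins
            self.current = attrs["href"]

    def handle_endtag(self, tag):
        if tag == "blockquote" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                if self.current:
                    self.links.append(self.current)
                self.current = None


def extract_tweet_links(html):
    parser = TweetLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample in the shape embed markup usually takes (an assumption)
sample_html = """
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Strike continues
<a href="https://t.co/abc123">pic.twitter.com/abc123</a></p>&mdash; Example (@example)
<a href="https://twitter.com/example/status/1234567890">March 4, 2018</a></blockquote>
"""

print(extract_tweet_links(sample_html))
```

With BeautifulSoup the same idea would be `tweet.find_all('a')[-1]['href']` inside the loop above. Either way you only get what is in the static source; the fully rendered tweet still needs a browser-driven tool such as Selenium.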
Upvotes: 1