Reputation: 95
I'm trying to fetch some dynamic values from a table on a webpage. This image shows the values I want to grab from that page. Is there any way to grab them using requests? I looked for a hidden API in dev tools and also went through the script tags in the page source, but I could not find the values.
This is the site url
This is the expected output I'm after.
This is what I've written so far:
import requests
from bs4 import BeautifulSoup
url = "https://www.dailyfx.com/sentiment"
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'}
r = requests.get(url,headers=headers)
soup = BeautifulSoup(r.text,"lxml")
for items in soup.select(".dfx-technicalSentimentCard__barContainer"):
    data = [item.get("data-value") for item in items.select("[data-type='long-value-info'],[data-type='short-value-info']")]
    print(data)
The script above prints only placeholder values instead of the numbers:
['--', '--']
['--', '--']
['--', '--']
['--', '--']
['--', '--']
['--', '--']
['--', '--']
How can I get the values from that table using requests?
Upvotes: 1
Views: 292
Reputation: 1933
Since the content loads dynamically, you have to use Selenium to collect the required information:
import time
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
# Selenium 4 style: the driver path goes through Service, the options through `options=`
chrome_options = Options()
chrome_options.add_argument("--window-size=1920,1080")
chrome_options.add_argument("--headless")
path_to_chromedriver = 'chromedriver'
driver = webdriver.Chrome(service=Service(path_to_chromedriver), options=chrome_options)
driver.get('https://www.dailyfx.com/sentiment')
# Scroll so the sentiment cards come into view, then give the page time to fill them in
driver.find_element(By.TAG_NAME, 'body').send_keys(Keys.PAGE_DOWN)
time.sleep(5)
driver.find_element(By.TAG_NAME, 'body').send_keys(Keys.PAGE_DOWN)
# Parse the rendered DOM rather than the raw HTTP response
soup = BeautifulSoup(driver.page_source, "lxml")
for items in soup.select(".dfx-technicalSentimentCard__barContainer"):
    data = [item.get("data-value") for item in items.select("[data-type='long-value-info'],[data-type='short-value-info']")]
    print(data)
driver.quit()
This code prints the following output:
['43', '57']
['53', '47']
['38', '62']
['56', '44']
['57', '43']
['39', '61']
['48', '52']
['77', '23']
['41', '59']
['55', '45']
['56', '44']
['74', '26']
['65', '35']
['87', '13']
['55', '45']
['32', '68']
['43', '57']
['45', '55']
['64', '36']
['56', '44']
['84', '16']
['86', '14']
['97', '3']
['90', '10']
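The fixed time.sleep(5) pause can be too short on a slow connection and wasteful on a fast one. As a rough sketch, one alternative is to poll until the "--" placeholders are gone. The names wait_for_values and read_values are hypothetical, not part of Selenium, and the helper is written in plain Python so it can be tried without a browser:

```python
import time

PLACEHOLDER = "--"  # value the page shows before the real figures arrive


def wait_for_values(read_values, timeout=10.0, interval=0.5):
    """Call read_values() repeatedly until it returns a non-empty list
    that contains no placeholder, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        values = read_values()
        if values and PLACEHOLDER not in values:
            return values
        if time.monotonic() >= deadline:
            raise TimeoutError("sentiment values were not populated in time")
        time.sleep(interval)


# Stand-in for a browser: each call returns the next "snapshot" of the page.
snapshots = iter([["--", "--"], ["--", "57"], ["43", "57"]])
print(wait_for_values(lambda: next(snapshots), timeout=5, interval=0.01))
```

With a real driver, read_values would presumably be a small function that collects the data-value attributes from the live page, for example via driver.find_elements(By.CSS_SELECTOR, ...) and get_attribute("data-value"). Selenium's own WebDriverWait covers the same ground and is likely the better choice in production code.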
Upvotes: 8
Reputation: 60
Your code is fine. The problem is that the figures are loaded dynamically. If you check the page source, you will not find those numbers, only the "--" placeholders.
item_list = soup.find_all(attrs={"class":"dfx-technicalSentimentCard__barContainer"})
print(item_list[-1])
<div class="dfx-technicalSentimentCard__barContainer">
<div class="dfx-sentimentPercentageBar dfx-sentimentPercentageBar--textHidden dfx-technicalSentimentCard__bar">
<div class="dfx-sentimentPercentageBar__long font-weight-bold" data-market-id="LTCUSD" data-stream-type="sentiment" data-type="long-bar" data-value="--">
</div>
<div class="dfx-sentimentPercentageBar__short font-weight-bold" data-market-id="LTCUSD" data-stream-type="sentiment" data-type="short-bar" data-value="--">
</div>
</div>
<div class="dfx-technicalSentimentCard__netLongContainer">
<span class="dfx-technicalSentimentCard__netLongText">Net Long</span>
<span class="dfx-rateDetail__percentageInfoText font-weight-bold" data-market-id="LTCUSD" data-stream-type="sentiment" data-type="long-value-info" data-value="--"></span>
</div>
<div class="dfx-technicalSentimentCard__netShortContainer">
<span class="dfx-technicalSentimentCard__netShortText">Net Short</span>
<span class="dfx-rateDetail__percentageInfoText font-weight-bold" data-market-id="LTCUSD" data-stream-type="sentiment" data-type="short-value-info" data-value="--"></span>
</div>
</div>
So you should switch to Selenium to get those figures.
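To check quickly whether a given HTML response already contains real figures or only placeholders, something like the following could be used. It is a minimal sketch: it uses the standard library's html.parser instead of BeautifulSoup to stay dependency-free, the names SentimentParser, parse_sentiment and is_unpopulated are made up for illustration, and SAMPLE is just the two spans from the markup above:

```python
from html.parser import HTMLParser


class SentimentParser(HTMLParser):
    """Collect long/short data-value attributes keyed by data-market-id."""

    WANTED = {"long-value-info": "long", "short-value-info": "short"}

    def __init__(self):
        super().__init__()
        self.markets = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        side = self.WANTED.get(attr.get("data-type"))
        if side is None:
            return  # not one of the value spans we care about
        market = attr.get("data-market-id")
        self.markets.setdefault(market, {})[side] = attr.get("data-value")


def parse_sentiment(html):
    """Return {market_id: {"long": value, "short": value}} for the given HTML."""
    parser = SentimentParser()
    parser.feed(html)
    return parser.markets


def is_unpopulated(markets):
    """True when every collected value is still the "--" placeholder."""
    return all(v == "--" for sides in markets.values() for v in sides.values())


SAMPLE = """
<span data-market-id="LTCUSD" data-stream-type="sentiment" data-type="long-value-info" data-value="--"></span>
<span data-market-id="LTCUSD" data-stream-type="sentiment" data-type="short-value-info" data-value="--"></span>
"""

markets = parse_sentiment(SAMPLE)
print(markets)                  # {'LTCUSD': {'long': '--', 'short': '--'}}
print(is_unpopulated(markets))  # True
```

Running parse_sentiment on r.text from requests should give placeholders for every market, while the same call on driver.page_source after the page has rendered should give the actual percentages.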
Upvotes: 0