SMTH

Reputation: 95

Trouble collecting some values from a webpage using requests

I'm trying to fetch some dynamic values out of a table on a webpage. This image shows the values I wish to grab from that page. Is there any way to grab them using requests? For the record, I looked for any hidden API in dev tools and also went through the script tags in the page source, but I could not find the values.

This is the site URL.

This is the expected output I'm after.

This is what I've written so far:

import requests
from bs4 import BeautifulSoup

url = "https://www.dailyfx.com/sentiment"

headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'}

r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.text, "lxml")
for items in soup.select(".dfx-technicalSentimentCard__barContainer"):
    data = [item.get("data-value") for item in items.select("[data-type='long-value-info'],[data-type='short-value-info']")]
    print(data)

The above script produces only placeholder output, like below:

['--', '--']
['--', '--']
['--', '--']
['--', '--']
['--', '--']
['--', '--']
['--', '--']

How can I get the values from that table using requests?

Upvotes: 1

Views: 292

Answers (2)

Roman

Reputation: 1933

Since the content loads dynamically, you have to use Selenium to collect the required information:

import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

chrome_options = Options()
chrome_options.add_argument("--window-size=1920x1080")
chrome_options.add_argument("--headless")

path_to_chromedriver = 'chromedriver'
driver = webdriver.Chrome(service=Service(path_to_chromedriver), options=chrome_options)

driver.get('https://www.dailyfx.com/sentiment')

# Scroll so the sentiment cards render, then give the page's JavaScript
# time to replace the '--' placeholders with real numbers
driver.find_element(By.TAG_NAME, 'body').send_keys(Keys.PAGE_DOWN)
time.sleep(5)
driver.find_element(By.TAG_NAME, 'body').send_keys(Keys.PAGE_DOWN)

soup = BeautifulSoup(driver.page_source, "lxml")
for items in soup.select(".dfx-technicalSentimentCard__barContainer"):
    data = [item.get("data-value") for item in items.select("[data-type='long-value-info'],[data-type='short-value-info']")]
    print(data)

driver.quit()

This code produces the following output:

['43', '57']
['53', '47']
['38', '62']
['56', '44']
['57', '43']
['39', '61']
['48', '52']
['77', '23']
['41', '59']
['55', '45']
['56', '44']
['74', '26']
['65', '35']
['87', '13']
['55', '45']
['32', '68']
['43', '57']
['45', '55']
['64', '36']
['56', '44']
['84', '16']
['86', '14']
['97', '3']
['90', '10']

Upvotes: 8

hkgyyf

Reputation: 60

There is no problem with your code; the issue is that the figures are dynamic. If you check the page source, you cannot find those numbers, only "--":

item_list = soup.find_all(attrs={"class": "dfx-technicalSentimentCard__barContainer"})
print(item_list[-1])

<div class="dfx-technicalSentimentCard__barContainer">
<div class="dfx-sentimentPercentageBar dfx-sentimentPercentageBar--textHidden dfx-technicalSentimentCard__bar">
<div class="dfx-sentimentPercentageBar__long font-weight-bold" data-market-id="LTCUSD" data-stream-type="sentiment" data-type="long-bar" data-value="--">
</div>
<div class="dfx-sentimentPercentageBar__short font-weight-bold" data-market-id="LTCUSD" data-stream-type="sentiment" data-type="short-bar" data-value="--">
</div>
</div>
<div class="dfx-technicalSentimentCard__netLongContainer">
<span class="dfx-technicalSentimentCard__netLongText">Net Long</span>
<span class="dfx-rateDetail__percentageInfoText font-weight-bold" data-market-id="LTCUSD" data-stream-type="sentiment" data-type="long-value-info" data-value="--"></span>
</div>
<div class="dfx-technicalSentimentCard__netShortContainer">
<span class="dfx-technicalSentimentCard__netShortText">Net Short</span>
<span class="dfx-rateDetail__percentageInfoText font-weight-bold" data-market-id="LTCUSD" data-stream-type="sentiment" data-type="short-value-info" data-value="--"></span>
</div>
</div>
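A quick way to confirm this (my illustration, not part of the answer): run a regex for the `data-value` attributes over the static markup above. With only the standard library, the placeholders fall out directly.

```python
import re

# Abridged copy of the LTCUSD card markup shown above (attributes trimmed)
snippet = '''
<span data-market-id="LTCUSD" data-type="long-value-info" data-value="--"></span>
<span data-market-id="LTCUSD" data-type="short-value-info" data-value="--"></span>
'''

# Pull the data-value that follows each long/short value-info attribute
values = re.findall(
    r'data-type="(?:long|short)-value-info"[^>]*data-value="([^"]*)"',
    snippet,
)
print(values)  # ['--', '--']
```

The same pattern run against the Selenium-rendered page source would yield the real numbers, since the JavaScript rewrites those attributes in place.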

So you should switch to Selenium to get those figures.

Upvotes: 0
