Reputation: 36
I would appreciate it if someone could show me how to extract the number '28,050'.
I used to get that number with this piece of code (Python 3):
import requests
import bs4
res_bonbast = requests.get('https://bonbast.com/')
soup_bonbast = bs4.BeautifulSoup(res_bonbast.text,"lxml")
int(float(soup_bonbast.select('#usd1_top')[0].getText().replace(',', '')))
But recently they seem to have changed something.
Upvotes: 0
Views: 1201
Reputation: 805
Your problem is that this value is not populated until after the page loads. The HTML for this element is indeed blank, exactly as your script is showing you. When you load the site in your browser (you can confirm this by opening dev tools and looking at the Network tab), you first receive HTML where this element is empty. A later call to https://bonbast.com/json returns the values that are used to populate the element.
What you need to do is make a request to bonbast.com/json yourself and extract the value you want from the JSON, rather than parsing the HTML. The key you are looking for there is usd1.
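A minimal sketch of that request, assuming you have captured a fresh `data` token from your browser's dev tools (the token and cookie expire, as discussed below). The header set here is a reduced, hypothetical subset of what the browser sends; the endpoint may reject requests without more of them:

```python
import requests

# Headers the browser sends with the XHR; a reduced, hypothetical subset.
HEADERS = {
    "x-requested-with": "XMLHttpRequest",
    "referer": "https://bonbast.com/",
    "user-agent": "Mozilla/5.0",
}

def fetch_usd1(token: str) -> int:
    """POST to the JSON endpoint with a token captured from dev tools
    and return the usd1 rate as an int."""
    resp = requests.post(
        "https://bonbast.com/json",
        headers=HEADERS,
        data={"data": token, "webdriver": "false"},
    )
    resp.raise_for_status()
    return int(resp.json()["usd1"])
```

Call it with a token copied from the `data=` parameter of the captured request, e.g. `fetch_usd1("0d7e26d1...")`.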
The bonbast.com/json endpoint may require additional headers. I captured the curl request below by visiting bonbast.com with my dev tools Network tab open (in Chrome, ctrl+shift+i >> Network) and finding the request for bonbast.com/json. I then right-clicked on it and selected "copy as cURL".
curl 'https://bonbast.com/json' \
-H 'authority: bonbast.com' \
-H 'sec-ch-ua: "Chromium";v="95", ";Not A Brand";v="99"' \
-H 'accept: application/json, text/javascript, */*; q=0.01' \
-H 'content-type: application/x-www-form-urlencoded; charset=UTF-8' \
-H 'x-requested-with: XMLHttpRequest' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36' \
-H 'sec-ch-ua-platform: "Linux"' \
-H 'origin: https://bonbast.com' \
-H 'sec-fetch-site: same-origin' \
-H 'sec-fetch-mode: cors' \
-H 'sec-fetch-dest: empty' \
-H 'referer: https://bonbast.com/' \
-H 'accept-language: en-US,en;q=0.9' \
-H 'cookie: st_bb=0; _gid=GA1.2.587414378.1636538685; __gads=ID=2f6e05bb70db575d-2208cfa441cc00d3:T=1636538685:RT=1636538685:S=ALNI_MaKL18-XZaWbbhlmh2h3RGvYmVKRw; _ga_PZF6SDPF22=GS1.1.1636562265.2.0.1636562265.0; _ga=GA1.2.633937873.1636538685; _gat_gtag_UA_35412804_1=1' \
--data-raw 'data=0d7e26d17fde20e86b760b00127132d4%2CfTtTZ%2C2021-11-10-16-38-37&webdriver=false' \
--compressed
The result is:
{ "try1": "2890",
"month": 8,
"emami1": "12450000",
"afn2": "309",
"afn1": "311",
"rub2": "397",
"azadi1_22": "6250000",
"bhd2": "74870",
"azn1": "16730",
"bhd1": "75370",
"azadi1g": "2350000",
"bourse": "1904324.2",
"try2": "2870",
"cny1": "4450",
"cny2": "4430",
"cad1": "22860",
"cad2": "22760",
"jpy1": "2495",
"thb1": "865",
"usd1": "28420",
"usd2": "28320",
"thb2": "860",
"azn2": "16630",
"dkk1": "4400",
"amd2": "590",
"day": 19,
"minute": "41",
"amd1": "595",
"bitcoin": "68616.85",
"hour": "20",
"sar2": "7545",
"rub1": "400",
"azadi1g2": "2250000",
"azadi12": "12000000",
"eur1": "32725",
"eur2": "32575",
"emami12": "12250000",
"second": "45",
"omr1": "73825",
"year": 1400,
"chf2": "30855",
"chf1": "31005",
"azadi1_42": "3700000",
"jpy2": "2485",
"kwd2": "93795",
"kwd1": "94195",
"sek1": "3280",
"gbp2": "38090",
"gbp1": "38290",
"sek2": "3265",
"myr1": "6850",
"myr2": "6820",
"omr2": "73525",
"azadi1": "12350000",
"azadi1_2": "6400000",
"aud2": "20805",
"azadi1_4": "3800000",
"aud1": "20905",
"dkk2": "4380",
"inr2": "380",
"inr1": "382",
"last_modified": "November 10, 2021 16:00",
"aed2": "7715",
"aed1": "7735",
"iqd2": "1935",
"qar1": "7805",
"qar2": "7775",
"iqd1": "1945",
"hkd2": "3620",
"hkd1": "3650",
"sar1": "7575",
"created": "November 10, 2021 00:01",
"sgd2": "20930",
"sgd1": "21030",
"ounce": "1854.31",
"weekday": "Wednesday",
"mithqal": "5416000",
"gol18": "1250288",
"nok1": "3305",
"nok2": "3290"
}
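Once you have that JSON, extracting the value is straightforward. A short sketch, using a truncated copy of the response above in place of a live request:

```python
import json

# Truncated copy of the JSON response, keeping only the keys used here.
sample = '{"usd1": "28420", "usd2": "28320"}'

data = json.loads(sample)
usd1 = int(data["usd1"])   # the values arrive as strings
print(usd1)                # 28420
print(f"{usd1:,}")         # 28,420 -- same comma format as the old page text
```

Note that the values are strings in the JSON, so convert with `int()` before doing arithmetic.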
However, bad news for you: the parameters in the curl request seem to expire after a period of time. I believe what's happening is that when you visit the website you are given a cookie. That cookie is what authorizes you to make the request to the JSON endpoint, but it expires after a short duration.
It will require a small amount of work to reliably scrape this page.
Upvotes: 1