Reputation: 7
I need the price of this cryptocurrency: https://dex.guru/token/0x68848e1d1ffd7b38d103106c74220c1ad3494afc-bsc. I am using this code:
import lxml
import requests
from lxml import html
html = requests.get('https://dex.guru/token/0x68848e1d1ffd7b38d103106c74220c1ad3494afc-bsc')
doc = lxml.html.fromstring(html.content)
new_releases = doc.xpath('//div[@class="0.00047061210058486165"]/text()')[0]
print(new_releases)
But I get this error:
IndexError: list index out of range
I know it's raising the error because the list is empty, but why is the list empty?
Please help, I am just starting out with scraping.
Upvotes: 0
Views: 364
Reputation: 786
I found a solution (an imperfect one for the moment):
import cloudscraper
# cloudscraper handles the Cloudflare challenge; it needs a JS interpreter (Node.js here)
scraper = cloudscraper.create_scraper(delay=15, interpreter='nodejs')
url = "https://api.dex.guru/v2/tokens/price"
# named `payload` so the standard `json` module name is not shadowed
payload = {"ids":
["0x68848e1d1ffd7b38d103106c74220c1ad3494afc-bsc",
"0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c-bsc"]}
headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0"}
# POST the token ids as a JSON body
resp = scraper.post(url, headers=headers, json=payload)
# when it works
print(resp.json())
You need to install the 'cloudscraper' package together with a JavaScript interpreter (here I used Node.js). This code sometimes returns data and sometimes fails. I will investigate to find out why it is so unstable.
When it works, it returns:
{'total': 2,
'data': [{'address': '0x68848e1d1ffd7b38d103106c74220c1ad3494afc',
'token_price_usd': 0.0003694899811954059,
'token_price_eth': 5.0304271359669745e-06},
{'address': '0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c',
'token_price_usd': 481.9784105344807,
'token_price_eth': 6.533152208208108}]}
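To get just the price you asked for out of that response, something like the following might do. It is only a sketch: it assumes the response keeps the shape shown above, the `sample` dict is copied from that output rather than fetched live, and the helper name `price_usd_for` is made up for illustration.

```python
from typing import Optional

# Sample payload copied from the output above; a live response may differ.
sample = {
    'total': 2,
    'data': [
        {'address': '0x68848e1d1ffd7b38d103106c74220c1ad3494afc',
         'token_price_usd': 0.0003694899811954059,
         'token_price_eth': 5.0304271359669745e-06},
        {'address': '0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c',
         'token_price_usd': 481.9784105344807,
         'token_price_eth': 6.533152208208108},
    ],
}


def price_usd_for(payload: dict, address: str) -> Optional[float]:
    """Return token_price_usd for `address`, or None if it is not listed."""
    wanted = address.lower()
    for entry in payload.get('data', []):
        # Compare case-insensitively, since hex addresses may vary in case.
        if entry.get('address', '').lower() == wanted:
            return entry.get('token_price_usd')
    return None


print(price_usd_for(sample, '0x68848e1d1ffd7b38d103106c74220c1ad3494afc'))
```

Returning None for a missing address avoids the same kind of IndexError as in the question when the token is not in the response.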
It's possible to build better code by setting up a session and saving the temporary cookies generated by Cloudflare (read the 'cloudflare' doc).
Note that once their official API is released, it will be preferable to use it.
Cloudflare may ban you if you run this kind of code in a loop without a sleep() between requests.
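A rough sketch of the "retry, but sleep in between" idea, given that the request only works some of the time. The function name `fetch_with_retry`, the attempt count and the delay are all assumptions for illustration, not anything dex.guru or Cloudflare documents. The request itself is passed in as a callable, so the helper does not depend on cloudscraper.

```python
import time
from typing import Any, Callable, Optional


def fetch_with_retry(fetch: Callable[[], Any],
                     attempts: int = 3,
                     delay_seconds: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[dict]:
    """Call `fetch` up to `attempts` times, pausing between failed tries.

    `fetch` must return an object with `.status_code` and `.json()`,
    like a requests/cloudscraper response. Returns the decoded JSON,
    or None if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            resp = fetch()
            if resp.status_code == 200:
                return resp.json()
        except Exception:
            # Broad on purpose for this sketch: network errors, challenge
            # failures and invalid JSON are all treated as "try again".
            pass
        # Pause before the next try, but not after the last one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return None
```

It could be used by wrapping the `scraper.post(...)` call from the code above in a lambda and passing it as `fetch`. Reusing the same `scraper` object for every call also keeps its cookies between requests, since it behaves like a requests session.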
Upvotes: 1