user15552005


BeautifulSoup not getting the right class

https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2

I want to scrape the href from the listbox next to the Token label.

I need to get the href from the element with

class="link-hover d-flex justify-content-between align-items-center"

Here is my code:

import requests
from bs4 import BeautifulSoup

page = requests.get('https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2').text
html = BeautifulSoup(page, 'html.parser')

href = html.find(class_='link-hover d-flex justify-content-between align-items-center')['href']

However, I get no result. Can anyone help me?

Upvotes: 0

Views: 123

Answers (1)

George Imerlishvili

Reputation: 1957

I don't think you can do this with the requests library, because Cloudflare detects the automation.

>>> page = requests.get('https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2')
>>> page.status_code
403

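The HTTP 403 Forbidden client error status code indicates that the server understood the request but refuses to authorize it, so the HTML you are parsing is not the address page. Instead of requests with bs4, try the selenium library.

The title of the page returned by requests confirms the block: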
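As a rough sketch of the selenium suggestion, the page could be loaded in a real browser and the rendered HTML handed to BeautifulSoup. The helper names `fetch_rendered_html` and `extract_hrefs` are made up for illustration. The sketch assumes Chrome and the selenium package are installed, and it assumes the token links still carry the classes quoted in the question. A real browser is not guaranteed to get past Cloudflare either, and the token list may need extra time to load.

```python
from bs4 import BeautifulSoup

URL = "https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2"
# CSS selector for an <a> that has all four classes from the question (in any order).
LINK_SELECTOR = "a.link-hover.d-flex.justify-content-between.align-items-center"


def extract_hrefs(html):
    """Return the href of every <a> that carries all the classes from the question."""
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.select(LINK_SELECTOR) if a.has_attr("href")]


def fetch_rendered_html(url):
    """Load the page in Chrome and return its HTML (hypothetical helper)."""
    # Imported here so extract_hrefs can be used without selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()


# Usage (opens a browser window):
# hrefs = extract_hrefs(fetch_rendered_html(URL))
```

Using a CSS selector instead of passing the whole class string to `find` means the match does not depend on the exact order of the classes in the attribute.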

>>> soup = BeautifulSoup(page.content, 'html.parser')
>>> soup.title
<title>Attention Required! | Cloudflare</title>
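The two checks above (status code and page title) could be folded into the script so that a blocked response fails loudly instead of silently yielding nothing. This is only a sketch: `is_blocked` and `first_token_href` are hypothetical helper names, and treating any title containing "Cloudflare" as a block page is an assumption based on the output shown here.

```python
from bs4 import BeautifulSoup


def is_blocked(status_code, html):
    """Guess whether a response is a Cloudflare block page, not the real page."""
    if status_code == 403:
        return True
    title = BeautifulSoup(html, "html.parser").title
    # Assumption: the block page title mentions Cloudflare, as in the output above.
    return title is not None and "Cloudflare" in title.get_text()


def first_token_href(status_code, html):
    """Return the first matching href, None if absent, or raise if blocked."""
    if is_blocked(status_code, html):
        raise RuntimeError(f"blocked (HTTP {status_code}): not the address page")
    soup = BeautifulSoup(html, "html.parser")
    link = soup.select_one(
        "a.link-hover.d-flex.justify-content-between.align-items-center"
    )
    return link.get("href") if link is not None else None
```

With requests, this would be called as `first_token_href(page.status_code, page.text)`, where `page` is the response object from `requests.get`.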

Upvotes: 1
