Reputation: 25
I am trying to extract the "Balance" integer value from this webpage but am having trouble figuring out how to isolate that list item.
This is the code I currently have:
import bs4, requests
res = requests.get('https://live.blockcypher.com/btc/address/3CpfD1gBBdNW7orErj3YyNNSVpzndZ9aP9/')
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text, 'html.parser')
elems = [elem for elem in soup.findAll('li') if 'Balance' in str(elem.text)]
print(elems)
However, when I run it, all I get is [] instead of the real balance value.
Any ideas on where I am going wrong?
Upvotes: 0
Views: 233
Reputation: 7248
To get the number, you can use this:
balance = soup.find('span', text='Balance').parent.contents[3].strip()
print(balance)
Output:
9.06451275 BTC
Explanation:
soup.find('span', text='Balance') returns this tag: <span class="dash-label">Balance</span>
Using .parent.contents gives the children of its parent tag as a list. In that list, the text you want is at index 3 (the fourth item, since indexing starts at 0), and .strip() removes the whitespace around it.
>>> for i, content in enumerate(soup.find('span', text='Balance').parent.contents):
...     print(i, content)
...
0
1 <span class="dash-label">Balance</span>
2 <br/>
3
9.06451275 BTC
4 <br/>
5
6 <span class="dash-label">
(-0.0500349 BTC unconfirmed)
</span>
7
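Relying on contents[3] breaks if the page adds or removes a node before the value. As a rough sketch of a less index-dependent variant, you could walk the siblings that follow the "Balance" label and take the first non-blank text node. The SAMPLE_HTML string and the extract_balance helper below are made up for illustration: the markup is modelled on the enumerate() output above, not copied from the live page, so the real structure may differ. The sketch also uses string= instead of text=, which is the newer name for the same argument in recent Beautiful Soup releases.

```python
from decimal import Decimal

from bs4 import BeautifulSoup, NavigableString

# Hypothetical markup modelled on the enumerate() output above; not the live page.
SAMPLE_HTML = """
<ul>
<li>
<span class="dash-label">Balance</span>
<br/>
9.06451275 BTC
<br/>
<span class="dash-label">
(-0.0500349 BTC unconfirmed)
</span>
</li>
</ul>
"""


def extract_balance(soup):
    """Return (amount, unit) from the first non-blank text node after the label."""
    label = soup.find('span', string='Balance')
    if label is None:
        return None  # label not present in this document
    for sibling in label.next_siblings:
        # Skip tags such as <br/> and whitespace-only text nodes.
        if isinstance(sibling, NavigableString) and sibling.strip():
            amount, unit = sibling.strip().split()
            return Decimal(amount), unit
    return None


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
balance = extract_balance(soup)
print(balance)
```

With the real page you would build soup from res.text as in the question. Decimal is used here rather than float so the amount keeps its exact decimal digits; the split() assumes the text is always "<number> <unit>", which is a guess based on the output shown above.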
Upvotes: 1