Reputation: 393
I can extract the text of each row from a URL, but the URLs behind the Chart and Holders links are missing from my output.
import requests
from bs4 import BeautifulSoup
import time
from selenium import webdriver
driver = webdriver.Chrome('chromedriver.exe')
url = 'https://poocoin.app/tokens/0xe56842ed550ff2794f010738554db45e60730371'
driver.get(url)
time.sleep(8)
soup = BeautifulSoup(driver.page_source, 'lxml')
data = soup.find('div', class_='overflow-auto unpad-3 ps-3').get_text()
print(data)
Current Output:
Pc v2 | BIN/BNB LP Holdings: 4,694.84 BNB ($2,221,326) | Chart | Holders
Pc v2 | BIN/BUSD LP Holdings: 0.03 BUSD ($0) | Chart | Holders
Pc v2 | BIN/USDT LP Holdings: 0.00 USDT ($0) | Chart | Holders
Wanted Output:
Pc v2 | BIN/BNB LP Holdings: 4,697.12 BNB ($2,226,112)
| Chart https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c?a=0xe432afB7283A08Be24E9038C30CA6336A7cC8218#tokenAnalytics
| Holders https://bscscan.com/token/0xe432afB7283A08Be24E9038C30CA6336A7cC8218#balances
Pc v2 | BIN/BUSD LP Holdings: 0.03 BUSD ($0)
| Chart https://bscscan.com/token/0xe9e7cea3dedca5984780bafc599bd69add087d56?a=0x61ca44133a0984EF96E2358947463C41837CaD50#tokenAnalytics
| Holders https://bscscan.com/token/0x61ca44133a0984EF96E2358947463C41837CaD50#balances
Pc v2 | BIN/USDT LP Holdings: 0.00 USDT ($0)
| Chart https://bscscan.com/token/0x55d398326f99059ff775485246999027b3197955?a=0x9eb614F1c85414328EdAA1508C626993d45B1453#tokenAnalytics
| Holders https://bscscan.com/token/0x9eb614F1c85414328EdAA1508C626993d45B1453#balances
Upvotes: 0
Views: 105
Reputation: 3433
Try this once:
soup = BeautifulSoup(driver.page_source,'html5lib')
rows = soup.find_all('div', class_='text-xs my-3')
for row in rows:
    data = row.get_text()
    # Find each link by its visible text and read its href attribute.
    chart = "Chart: {}".format(row.find('a', text=['Chart']).attrs['href'])
    holder = "Holders: {}".format(row.find('a', text=['Holders']).attrs['href'])
    print(data)
    print(chart)
    print(holder)
Output:
Pc v2 | BIN/BNB LP Holdings:4,708.86 BNB ($2,239,013) | Chart | Holders
Chart: https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c?a=0xe432afB7283A08Be24E9038C30CA6336A7cC8218#tokenAnalytics
Holders: https://bscscan.com/token/0xe432afB7283A08Be24E9038C30CA6336A7cC8218#balances
Pc v2 | BIN/BUSD LP Holdings:0.03 BUSD ($0) | Chart | Holders
Chart: https://bscscan.com/token/0xe9e7cea3dedca5984780bafc599bd69add087d56?a=0x61ca44133a0984EF96E2358947463C41837CaD50#tokenAnalytics
Holders: https://bscscan.com/token/0x61ca44133a0984EF96E2358947463C41837CaD50#balances
Pc v2 | BIN/USDT LP Holdings:0.00 USDT ($0) | Chart | Holders
Chart: https://bscscan.com/token/0x55d398326f99059ff775485246999027b3197955?a=0x9eb614F1c85414328EdAA1508C626993d45B1453#tokenAnalytics
Holders: https://bscscan.com/token/0x9eb614F1c85414328EdAA1508C626993d45B1453#balances
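If a row is ever missing one of the two links, row.find(...) returns None and the .attrs lookup raises an AttributeError. Below is a minimal sketch of a guarded version. It runs on a static snippet instead of the live page; SAMPLE_HTML, the example.com URLs, and the helpers link_for and extract_rows are made up for illustration, and the markup is only a guess modeled on the class names used above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the selectors used above; the real page may differ.
# The second row deliberately has no Holders link.
SAMPLE_HTML = """
<div class="overflow-auto unpad-3 ps-3">
  <div class="text-xs my-3">
    <a target="_blank" href="https://example.com/pair-1">Pc v2</a> | BIN/BNB LP Holdings:
    <span>4,708.86 BNB ($2,239,013)</span> |
    <a href="https://example.com/chart-1">Chart</a> |
    <a href="https://example.com/holders-1">Holders</a>
  </div>
  <div class="text-xs my-3">
    <a target="_blank" href="https://example.com/pair-2">Pc v2</a> | BIN/BUSD LP Holdings:
    <span>0.03 BUSD ($0)</span> |
    <a href="https://example.com/chart-2">Chart</a>
  </div>
</div>
"""


def link_for(row, label):
    """Return the href of the <a> whose text equals label, or None if absent."""
    anchor = row.find("a", string=label)
    return anchor.get("href") if anchor else None


def extract_rows(html):
    """Collect the row text plus the Chart and Holders URLs for every row."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.find_all("div", class_="text-xs my-3"):
        results.append({
            # Collapse the whitespace between nodes into single spaces.
            "text": " ".join(row.get_text().split()),
            "chart": link_for(row, "Chart"),
            "holders": link_for(row, "Holders"),
        })
    return results


for item in extract_rows(SAMPLE_HTML):
    print(item["text"])
    print("Chart: {}".format(item["chart"]))
    print("Holders: {}".format(item["holders"]))
```

With Selenium you would pass driver.page_source to extract_rows instead of SAMPLE_HTML.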
Upvotes: 1
Reputation: 3400
To get the links in one line, use the find_all method on the a tag and pass text to select specific links:
all_links=[ i['href'] for i in soup.find('div', class_='overflow-auto unpad-3 ps-3').find_all("a",text=['Chart','Holders'])]
Output:
['https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c?a=0xe432afB7283A08Be24E9038C30CA6336A7cC8218#tokenAnalytics',
'https://bscscan.com/token/0xe432afB7283A08Be24E9038C30CA6336A7cC8218#balances',
'https://bscscan.com/token/0xe9e7cea3dedca5984780bafc599bd69add087d56?a=0x61ca44133a0984EF96E2358947463C41837CaD50#tokenAnalytics',
'https://bscscan.com/token/0x61ca44133a0984EF96E2358947463C41837CaD50#balances',
'https://bscscan.com/token/0x55d398326f99059ff775485246999027b3197955?a=0x9eb614F1c85414328EdAA1508C626993d45B1453#tokenAnalytics',
'https://bscscan.com/token/0x9eb614F1c85414328EdAA1508C626993d45B1453#balances']
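The flat list loses the grouping by row. If you only need the links paired up, here is a small sketch that regroups them. It assumes every row contributes exactly one Chart link followed by one Holders link, which holds for the output above but is not guaranteed by the page. The helper name pair_links is made up for illustration.

```python
# First two rows of the flat list shown above, copied from that output.
all_links = [
    "https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c?a=0xe432afB7283A08Be24E9038C30CA6336A7cC8218#tokenAnalytics",
    "https://bscscan.com/token/0xe432afB7283A08Be24E9038C30CA6336A7cC8218#balances",
    "https://bscscan.com/token/0xe9e7cea3dedca5984780bafc599bd69add087d56?a=0x61ca44133a0984EF96E2358947463C41837CaD50#tokenAnalytics",
    "https://bscscan.com/token/0x61ca44133a0984EF96E2358947463C41837CaD50#balances",
]


def pair_links(links):
    """Group a flat [chart, holders, chart, holders, ...] list into tuples."""
    if len(links) % 2:
        # An odd count means at least one row is missing a link.
        raise ValueError("expected an even number of links")
    # Even positions are Chart links, odd positions are Holders links.
    return list(zip(links[::2], links[1::2]))


for chart, holders in pair_links(all_links):
    print("Chart", chart)
    print("Holders", holders)
```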
As per your requirement:
data = soup.find('div', class_='overflow-auto unpad-3 ps-3').find_all("div", class_="text-xs my-3")
for i in data:
    # Print the leading link text, the first two text nodes after the first link, then the span.
    print(i.find("a", attrs={"target": "_blank"}).get_text(), end="")
    print(" ".join(i.find("a").find_next_siblings(text=True)[:2]), end="")
    print(i.find("span").get_text())
    # Build one "label URL" line per Chart/Holders link.
    links = [a.get_text() + " " + a['href'] for a in i.find_all("a", text=['Chart', 'Holders'])]
    print(*links, sep="\n")
Output:
Pc v2 | BIN/BNB LP Holdings: 4,716.76 BNB ($2,234,449)
Chart https://bscscan.com/token/0xbb4cdb9cbd36b01bd1cbaebf2de08d9173bc095c?a=0xe432afB7283A08Be24E9038C30CA6336A7cC8218#tokenAnalytics
Holders https://bscscan.com/token/0xe432afB7283A08Be24E9038C30CA6336A7cC8218#balances
Pc v2 | BIN/BUSD LP Holdings: 0.03 BUSD ($0)
Chart https://bscscan.com/token/0xe9e7cea3dedca5984780bafc599bd69add087d56?a=0x61ca44133a0984EF96E2358947463C41837CaD50#tokenAnalytics
Holders https://bscscan.com/token/0x61ca44133a0984EF96E2358947463C41837CaD50#balances
Pc v2 | BIN/USDT LP Holdings: 0.00 USDT ($0)
Chart https://bscscan.com/token/0x55d398326f99059ff775485246999027b3197955?a=0x9eb614F1c85414328EdAA1508C626993d45B1453#tokenAnalytics
Holders https://bscscan.com/token/0x9eb614F1c85414328EdAA1508C626993d45B1453#balances
Upvotes: 2