Reputation: 60
I want to fetch all the user handles listed on this page: https://practice.geeksforgeeks.org/leaderboard/
This is the code I tried:
import requests
from bs4 import BeautifulSoup

URL = 'https://practice.geeksforgeeks.org/leaderboard/'

def getdata(url):
    r = requests.get(url)
    return r.text

htmldata = getdata(URL)
soup = BeautifulSoup(htmldata, 'html.parser')
table = soup.find_all('table', {"id": "leaderboardTable"})
print(table[0].find_all('tbody')[1])
print(table[0].find_all('tbody')[1].tr)
Output:
<tbody id="overall_ranking">
</tbody>
None
The code fetches the table, but when I try to print the tr or td tags inside it, I get None. I also tried another approach using pandas, and the same thing happens.
I just want all the user handles present in this table (https://practice.geeksforgeeks.org/leaderboard/).
Any solution to this problem will be highly appreciated.
Upvotes: 0
Views: 183
Reputation: 16187
The page is dynamic: the table rows are filled in by JavaScript, which BeautifulSoup cannot execute, so the HTML that requests receives contains an empty tbody. The data itself comes from an API, which you can query directly.
import requests

api_url = 'https://practiceapi.geeksforgeeks.org/api/v1/leaderboard/ranking/?ranking_type=overall&page={page}'
for page in range(1, 11):
    data = requests.get(api_url.format(page=page)).json()
    for handle in data:
        print(handle['user_handle'])
Output:
Ibrahim Nash
blackshadows
mb1973
Quandray
akhayrutdinov
saiujwal13083
shivendr7
kirtidee18
mantu_singh
cfwong8
harshvardhancse1934
sgupta9519
sanjay05
samiranroy0407
Maverick_H
sreerammuthyam999
gfgaccount
sushant_a
verma_ji
balkar81199
marius_valentin_dragoi
ishu2001mitra
_tony_stark_01
ta7anas17113011638
yups0608
himanshujainmalpura
yujjwal9700
parthabhunia_04
KshamaGupta
the_coder95
ayush_gupta4
khushbooguptaciv18
aditya dhiman
dilipsuthar00786
adityajain9560
dharmsharma0811
Aegon_Targeryan
1032180422
mangeshagarwal1974
naveedaamir484
raj_271
Pulkit__Sharma__
aroranayan999
surbhi_7
ruchika1004ajmera
cs845418
shadymasum
lonewolf13325
user_1_4_13_19_22
SubhankarMajumdar
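The loop above hardcodes ten pages and assumes every response is a JSON list of objects with a user_handle key. A slightly more defensive sketch under that same assumption about the response shape (it is inferred from the loop, not from any API documentation) could look like the following. The helper names, the 10-second timeout, and the stop-at-first-empty-page rule are illustrative choices, not part of the original answer.

```python
API_URL = ("https://practiceapi.geeksforgeeks.org/api/v1/leaderboard/ranking/"
           "?ranking_type=overall&page={page}")


def extract_handles(payload):
    """Return the user handles found in one decoded page.

    Assumes a page is a list of dicts carrying a 'user_handle' key;
    anything else (an error object, a row without the key) is skipped.
    """
    if not isinstance(payload, list):
        return []
    return [row["user_handle"] for row in payload
            if isinstance(row, dict) and "user_handle" in row]


def fetch_page(session, page):
    """Download and decode one leaderboard page with a requests-style session."""
    response = session.get(API_URL.format(page=page), timeout=10)
    response.raise_for_status()  # raise on HTTP errors instead of parsing them
    return response.json()


def collect_handles(get_page, max_pages=10):
    """Collect handles page by page, stopping at the first page with none."""
    handles = []
    for page in range(1, max_pages + 1):
        found = extract_handles(get_page(page))
        if not found:
            break
        handles.extend(found)
    return handles


def main():
    # requests is third-party; importing it here keeps the helpers above
    # usable (and testable) without a network connection.
    import requests

    with requests.Session() as session:
        for handle in collect_handles(lambda page: fetch_page(session, page)):
            print(handle)
```

Calling main() prints one handle per line, like the output above. Because collect_handles takes the page-fetching function as an argument, it can be tried against canned data before touching the real API.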
Upvotes: 2
Reputation: 1899
You can also get the handles with Selenium, which drives a real browser and therefore runs the page's JavaScript.
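from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 takes the driver path through a Service object;
# the executable_path argument is deprecated.
driver = webdriver.Chrome(service=Service("<webdriver path>"))
driver.get("https://practice.geeksforgeeks.org/leaderboard/")
user_names = driver.find_elements(By.XPATH, "//tbody[@id = 'overall_ranking']/tr/td/a")
user_names = [name.text for name in user_names]
driver.quit()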
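Since the rows are inserted by JavaScript after the page loads, find_elements can run before the table is populated and return an empty list. One way to guard against that is an explicit wait. The sketch below assumes Selenium 4 and a Chrome driver that Selenium can locate on its own. The function names and the 15-second timeout are hypothetical choices for illustration.

```python
HANDLE_XPATH = "//tbody[@id='overall_ranking']/tr/td/a"


def clean_handles(texts):
    """Strip whitespace and drop empty entries from scraped link texts."""
    return [t.strip() for t in texts if t and t.strip()]


def scrape_handles(url, timeout=15):
    """Open the page, wait for the handle links to exist, return their text."""
    # Selenium is imported here so clean_handles() works without it installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumes Selenium can find a Chrome driver
    try:
        driver.get(url)
        # Poll until at least one matching link is present in the DOM;
        # raises TimeoutException if none appears within `timeout` seconds.
        links = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.XPATH, HANDLE_XPATH))
        )
        return clean_handles(link.text for link in links)
    finally:
        driver.quit()  # close the browser even if the wait times out
```

It would be called as scrape_handles("https://practice.geeksforgeeks.org/leaderboard/"). The try/finally ensures the browser is closed whether or not the rows ever show up.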
Upvotes: 1