Reputation: 223
import requests
from bs4 import BeautifulSoup
request = requests.get("http://www.lolesports.com/en_US/worlds/world_championship_2016/standings/default")
content = request.content
soup = BeautifulSoup(content, "html.parser")
team_name = soup.findAll('text', {'class': 'team-name'})
print(team_name)
I'm trying to parse data from the URL "http://www.lolesports.com/en_US/worlds/world_championship_2016/standings/default". The individual team names sit in elements such as <text class="team-name">SK Telecom T1</text>.
What I am trying to do is extract that text (SK Telecom T1) and print it to the screen, but instead I get [], an empty list. What am I doing wrong?
Upvotes: 2
Views: 177
Reputation: 180391
You don't need Selenium: all the dynamic content can be retrieved in JSON format with a simple GET request to http://api.lolesports.com/api/v1/leagues:
import requests
data = requests.get("http://api.lolesports.com/api/v1/leagues?slug=worlds").json()
This returns a lot of data; what you want seems to be all under data["teams"]. A snippet of it:
[{'id': 2, 'slug': 'bangkok-titans', 'name': 'Bangkok Titans', 'teamPhotoUrl': 'http://na.lolesports.com/sites/default/files/BKT_GPL.TMPROFILE_0.png', 'logoUrl': 'http://assets.lolesports.com/team/bangkok-titans-597g0x1v.png', 'acronym': 'BKT', 'homeLeague': 'urn:rg:lolesports:global:league:league:12', 'altLogoUrl': None, 'createdAt': '2014-07-17T18:34:47.000Z', 'updatedAt': '2015-09-29T16:09:36.000Z', 'bios': {'en_US': 'The Bangkok Titans are the undisputed champions of Thailand’s League of Legends esports scene. They achieved six consecutive 1st place finishes in the Thailand Pro League from 2014 to 2015. However, they aren’t content with just domestic domination.
Each team is one dict in that list of dicts:
In [1]: import requests
In [2]: data = requests.get("http://api.lolesports.com/api/v1/leagues?slug=worlds").json()
In [3]: for d in data["teams"]:
   ...:     print(d["name"])
   ...:
Bangkok Titans
ahq e-Sports Club
SK Telecom T1
TSM
Fnatic
Cloud9
Counter Logic Gaming
H2K
Edward Gaming
INTZ e-Sports
paiN Gaming
Origen
LGD Gaming
Invictus Gaming
Royal Never Give Up
Flash Wolves
Splyce
Samsung Galaxy
KT Rolster
ROX Tigers
G2 Esports
I May
Albus NoX Luna
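If some entries might lack a "name" key, or the "teams" key is missing altogether, the loop above would raise a KeyError. A minimal sketch of a more defensive version follows. The SAMPLE payload is a hypothetical, trimmed-down stand-in shaped like the snippet above (the entry without a name is invented for illustration), and team_names is a helper name introduced here, not part of any API:

```python
import json

# Hypothetical, trimmed-down payload shaped like the snippet above.
# The real response carries many more fields per team; the nameless
# entry is invented purely to exercise the missing-key case.
SAMPLE = json.loads("""
{"teams": [
  {"id": 2, "slug": "bangkok-titans", "name": "Bangkok Titans", "acronym": "BKT"},
  {"slug": "entry-without-a-name"},
  {"name": "SK Telecom T1"}
]}
""")


def team_names(payload):
    """Return the 'name' of every team entry that has one."""
    teams = payload.get("teams", [])  # tolerate a missing "teams" key
    return [team["name"] for team in teams if "name" in team]


print(team_names(SAMPLE))
```

With the live data you would pass the result of requests.get(...).json() instead of SAMPLE.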
Upvotes: 2
Reputation: 1856
The website depends on JavaScript to load its content. Requests does not execute JS, so the HTML it returns never contains the team-name elements and there is nothing for BeautifulSoup to find.
For websites like this you will be better off with Selenium. It drives a real browser (Firefox or another one, through the matching driver) that renders the whole page, including the JS.
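To illustrate why the result is an empty list, here is a small sketch using only the standard library's html.parser. RAW_HTML and RENDERED_HTML are hypothetical stand-ins, not the site's actual markup: the first imitates what a plain GET might return before any script runs, the second imitates the DOM after the scripts have filled it in. TeamNameParser and extract_team_names are names made up for this example:

```python
from html.parser import HTMLParser


class TeamNameParser(HTMLParser):
    """Collect the text inside <text class="team-name"> elements."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "text" and "team-name" in classes:
            self._inside = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "text":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.names[-1] += data


def extract_team_names(html):
    parser = TeamNameParser()
    parser.feed(html)
    parser.close()
    return [name.strip() for name in parser.names]


# Hypothetical stand-ins, not the real page source.
RAW_HTML = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'
RENDERED_HTML = (
    '<html><body><div id="app">'
    '<text class="team-name">SK Telecom T1</text>'
    "</div></body></html>"
)

print(extract_team_names(RAW_HTML))       # nothing to find before JS runs
print(extract_team_names(RENDERED_HTML))  # names exist only in the rendered DOM
```

The parsing code is the same in both calls; only the input differs, which is why changing the parser or the selector in the question would not help.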
Upvotes: 2