Jason

Reputation: 25

Can't find the contents of a 'div' using BeautifulSoup

I'm trying to scrape some information about MLB players from the MLB website. However, when I fetch the page with urllib2 and parse it with BeautifulSoup, the 'div' I'm interested in comes back empty, even though I can clearly see its contents in Chrome.

For example, on this page (http://mlb.mlb.com/team/player.jsp?player_id=150378), the Status info in the upper right shows 'Released', but I can't find this string using BS4.

Here's my code:

import urllib2
from bs4 import BeautifulSoup

base_url = 'http://mlb.mlb.com/team/player.jsp?player_id=150378'
request = urllib2.Request(base_url)
response = urllib2.urlopen(request)
soup = BeautifulSoup(response)
# Look up the div that should hold the player's status
player_status = soup.findAll('div', id='player_status')
print player_status

I was expecting a string like 'Status: Released', but the result only shows:

[<div id="player_status"></div>]

I have never encountered this problem before. Can someone help me with this? Thanks!!

Upvotes: 2

Views: 1084

Answers (1)

alecxe

Reputation: 473863

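The player information is not part of the HTML that urllib2 downloads, which is why the div is empty. The page fills it in afterwards from the response of an additional XHR request to a JSON API. You can make the same request yourself, for example, using requests: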

>>> import requests
>>> 
>>> url = "http://mlb.mlb.com/lookup/json/named.player_info.bam?sport_code=%27mlb%27&player_id=150378"
>>> 
>>> response = requests.get(url)
>>> data = response.json()
>>> data['player_info']['queryResults']['row']['status']
Released
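
If you would rather not depend on requests, the same lookup can be sketched with only the Python 3 standard library. This is a minimal sketch, not a definitive implementation: it assumes the endpoint and the JSON layout shown above are still valid (the service may have changed or been retired), and the helper names build_player_url, extract_status and fetch_player_status are hypothetical, introduced only for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the session above; assumed to still be reachable.
API_URL = "http://mlb.mlb.com/lookup/json/named.player_info.bam"


def build_player_url(player_id):
    """Build the lookup URL; sport_code carries literal single quotes (%27)."""
    query = urlencode({"sport_code": "'mlb'", "player_id": player_id})
    return API_URL + "?" + query


def extract_status(data):
    """Return the status string, or None if the expected keys are missing."""
    try:
        return data["player_info"]["queryResults"]["row"]["status"]
    except (KeyError, TypeError):
        # The layout differs from the one assumed here.
        return None


def fetch_player_status(player_id, timeout=10):
    """Download the JSON for one player and pull out the status field."""
    with urlopen(build_player_url(player_id), timeout=timeout) as response:
        data = json.loads(response.read().decode("utf-8"))
    return extract_status(data)
```

Calling fetch_player_status(150378) performs the network request. Keeping extract_status separate means the parsing can be checked against a saved response without hitting the site, and it returns None instead of raising when a key is missing.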

Upvotes: 1
