Reputation: 63
I am trying to access player data from a scrape of the OHL stats page. The code below gives me everything I need:
import json

import pandas as pd
import requests
from bs4 import BeautifulSoup

# Fetch the stats page and read the feed key embedded in its markup
url = 'https://ontariohockeyleague.com/stats/players/68'
r = requests.get(url)
soup = BeautifulSoup(r.text, 'lxml')
key = soup.find('div', id='stats')['data-feed_key']

# Request the top-scorers JSON feed using that key
statviewtype_url = "http://lscluster.hockeytech.com/feed/?feed=modulekit&view=statviewtype&type=topscorers&key=%s&fmt=json&client_code=ohl&lang=en&league_code=&season_id=68&first=0&limit=100&sort=active&stat=all&order_direction="
r = requests.get(statviewtype_url % key)
d = json.loads(r.text)
df = pd.DataFrame(d)
print(df)
Output:
Copyright {'required_copyright': 'Official statistics pr...
Parameters {'feed': 'modulekit', 'view': 'statviewtype', ...
Statviewtype [{'player_id': '7889', 'shortname': 'M. Rossi'...
What I want is the Statviewtype dictionary (I assume that's what it is). But when I try to access it with something like print(df['Statviewtype']), I get errors. Am I mixing up data types? Am I oversimplifying this?
Upvotes: 0
Views: 30
Reputation: 1305
Your DataFrame has three index labels, ['Copyright', 'Parameters', 'Statviewtype'], and one column, ['SiteKit']. df['Statviewtype'] fails because [] looks up a column, and Statviewtype is a row label. To read Statviewtype you need:
df.loc['Statviewtype', 'SiteKit'][0]
This specifies the index label Statviewtype and the column SiteKit. Also note the [0] at the end. That is because the value stored there is a list of dictionaries, and [0] selects the first one. Once that's done, you're good to go:
In []: df.loc['Statviewtype', 'SiteKit'][0].keys()
Out[]: dict_keys(['player_id', 'shortname', 'first_name', ... , 'namelink', 'teamlink', 'photo'])
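If the goal is a table with one row per player, it may be simpler to skip the outer DataFrame and pass the Statviewtype list to pandas directly. The sketch below is only an illustration: the dictionary d is a hand-written stand-in for the parsed feed, its nesting is inferred from the output shown in the question, and the second player row is made up.

```python
import pandas as pd

# Hypothetical stand-in for the parsed JSON feed. Only the nesting
# (SiteKit -> Statviewtype -> list of dicts) is taken from the question.
d = {
    "SiteKit": {
        "Copyright": {"required_copyright": "..."},
        "Parameters": {"feed": "modulekit", "view": "statviewtype"},
        "Statviewtype": [
            {"player_id": "7889", "shortname": "M. Rossi"},
            {"player_id": "0000", "shortname": "A. Example"},  # made-up row
        ],
    }
}

df = pd.DataFrame(d)

# df["Statviewtype"] raises KeyError: [] looks up a column, not a row label.
try:
    df["Statviewtype"]
    column_lookup_failed = False
except KeyError:
    column_lookup_failed = True

# .loc takes (row label, column label) and returns the stored list.
first_player = df.loc["Statviewtype", "SiteKit"][0]

# Build one row per player straight from the list of dictionaries.
players = pd.DataFrame(d["SiteKit"]["Statviewtype"])
print(players)
```

With the real feed, d would be the result of json.loads(r.text) from the question, and players would get one column per key in each player dictionary.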
Upvotes: 1