Reputation: 516
I am scraping data from two JSON responses.
The first one contains the data I want to collect.
The second one does not contain it, and I want to store 'NA' in its place.
My problem is that I don't know how to store 'NA' correctly in my script.
Here is my code:
import requests

# this is our profile ids
profile = ['kaid_896965538702696832878421', 'kaid_1143236333220233567674383']
# prepare the list to get data
badgechall = []
# do this for each profile id
for kaid in profile:
    # request the api link of the profile
    data = requests.get('https://www.khanacademy.org/api/internal/user/{}/profile/widgets?lang=en&_=190424-1429-bcf153233dc9_1556201931959'.format(kaid)).json()
    # go through each json file to get the data
    for item in data:
        # try to find on each dictionary of the list the desired data or pass
        try:
            for badges in item['renderData']['badgeCountData']['counts']:
                if badges['typeLabel'] == 'Challenge Patches':
                    badgechall.append(badges['count'])
        except KeyError:
            pass
print(badgechall)
When I run this code, I get:
[100]
What I would like to get is this:
[100, 'NA']
Here 100 corresponds to the first profile, 'kaid_896965538702696832878421', and 'NA' corresponds to the second profile, 'kaid_1143236333220233567674383'.
I would like one value per profile: the count if it exists, and 'NA' if it does not. So the list should contain exactly 2 values.
I tried:
except KeyError:
    badgechall.append('NA')
    pass
But it returns:
[100, 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA']
Upvotes: 0
Views: 71
Reputation: 82899
You could define a function that returns the first matching count, or "NA" if there is none.
def get_badge_count(data, badge='Challenge Patches'):
    for item in data:
        try:
            for badges in item['renderData']['badgeCountData']['counts']:
                if badges['typeLabel'] == badge:
                    return badges['count']
        except KeyError:
            pass
    return "NA"

for kaid in profile:
    data = requests.get('https://www.khanacademy.org/api/internal/user/{}/profile/widgets?lang=en&_=190424-1429-bcf153233dc9_1556201931959'.format(kaid)).json()
    badgechall.append(get_badge_count(data))
Afterwards, badgechall is [100, 'NA']. If you want to match another badge type, you can pass it as a parameter, e.g. get_badge_count(data, 'Sun Patches').
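To try the function without calling the API, here is a minimal sketch that runs it on hand-made data. The two lists with_badges and without_badges are hypothetical stand-ins for the JSON responses (their contents are made up for illustration, only the nesting of keys follows the code above), so treat this as a way to check the logic rather than a picture of the real payload.

```python
def get_badge_count(data, badge='Challenge Patches'):
    """Return the count of the first matching badge, or 'NA' if none is found."""
    for item in data:
        try:
            for badges in item['renderData']['badgeCountData']['counts']:
                if badges['typeLabel'] == badge:
                    return badges['count']
        except KeyError:
            # This widget has no badge data; keep looking in the next one.
            pass
    return 'NA'


# Hypothetical stand-ins for the two API responses (assumed shape).
with_badges = [
    {'renderData': {}},
    {'renderData': {'badgeCountData': {'counts': [
        {'typeLabel': 'Sun Patches', 'count': 3},
        {'typeLabel': 'Challenge Patches', 'count': 100},
    ]}}},
]
without_badges = [{'renderData': {}}, {}]

badgechall = [get_badge_count(d) for d in (with_badges, without_badges)]
print(badgechall)  # [100, 'NA']
```

Because the function returns as soon as it finds a match and falls through to 'NA' only after every item has been checked, each profile contributes exactly one value, however many items lack the key.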
Upvotes: 1
Reputation: 39374
Did you mean that you want to break out of the for loop?
except KeyError:
    badgechall.append('NA')
    break
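If the goal is one value per profile, one possible variation on the break idea is to break when the badge is found and use the loop's else clause to append 'NA' only when no item matched. This is a sketch under assumptions, not a tested fix against the live API: collect_counts and the sample lists are hypothetical names and data introduced here for illustration.

```python
def collect_counts(responses, badge='Challenge Patches'):
    """Return one entry per response: the badge count, or 'NA' if it is absent."""
    results = []
    for data in responses:
        for item in data:
            try:
                counts = item['renderData']['badgeCountData']['counts']
            except KeyError:
                # No badge data in this item; move on to the next one.
                continue
            match = [b['count'] for b in counts if b['typeLabel'] == badge]
            if match:
                results.append(match[0])
                break  # found it, stop scanning this response
        else:
            # The inner loop finished without break: nothing matched.
            results.append('NA')
    return results


# Hypothetical stand-ins for the two API responses (assumed shape).
with_badges = [
    {},
    {'renderData': {'badgeCountData': {'counts': [
        {'typeLabel': 'Challenge Patches', 'count': 100},
    ]}}},
]
without_badges = [{'renderData': {}}, {}]

print(collect_counts([with_badges, without_badges]))  # [100, 'NA']
```

The else branch of a for loop runs only when the loop ends without hitting break, which is what keeps 'NA' from being appended once per item that lacks the key.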
Upvotes: 0