Reputation: 103
I am trying to retrieve JSON data from SWAPI, a REST API that has information about people, films, starships, and planets within the Star Wars universe.
Here is my code:
import requests
import pandas as pd

total_results = []
for page_num in range(1, 7):
    # Build the URL and download the results
    url = "https://swapi.dev/api/people/?page=" + str(page_num)
    print("Downloading", url)
    response = requests.get(url)
    data = response.json()
    total_results = total_results + data['results']
print("We have", len(total_results), "total results")
SW_people_df = pd.json_normalize(total_results)
SW_people_df.head()
Here is what the dataframe looks like:
| | name | height | mass | hair_color | skin_color | eye_color | birth_year | gender | species | url |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Luke Skywalker | 172 | 77 | blond | fair | blue | 19BBY | male | [] | http://swapi.dev/api/people/1/ |
| 1 | C-3PO | 167 | 75 | n/a | gold | yellow | 112BBY | n/a | ['http://swapi.dev/api/species/2/'] | http://swapi.dev/api/people/2/ |
| 2 | R2-D2 | 96 | 32 | n/a | white, blue | red | 33BBY | n/a | ['http://swapi.dev/api/species/2/'] | http://swapi.dev/api/people/3/ |
| 3 | Darth Vader | 202 | 136 | none | white | yellow | 41.9BBY | male | [] | http://swapi.dev/api/people/4/ |
| 4 | Leia Organa | 150 | 49 | brown | light | brown | 19BBY | female | [] | http://swapi.dev/api/people/5/ |
Is it possible to retrieve the data from the API including the nested links? That is, can I get the actual JSON data behind the links in the SW_people_df['species'] column instead of a list of links?
Thank you!
Upvotes: 0
Views: 563
Reputation: 31236
Interesting requirement
import requests
import pandas as pd

# people - pages 1 to 6 (range(1, 7) stops before 7)
dfp = pd.concat([pd.json_normalize(requests.get(f"https://swapi.dev/api/people/?page={p}").json()["results"]) for p in range(1,7)])
# get all the related data from urls against ppl:
# one dataframe per column that holds links, keyed by column name
linkeddf = {c:pd.concat([
    pd.json_normalize(requests.get(u).json()) for u in dfp[c].explode().dropna().unique()
    ]) for c in dfp.columns if dfp[c].explode().str.contains("http").any() and c!="url"}
# join ppl to homeworld
dfp.merge(linkeddf["homeworld"], left_on="homeworld", right_on="url", suffixes=("_person","_world"))
# what films has a skywalker been in?
(dfp.explode("films").merge(linkeddf["films"], left_on="films", right_on="url", suffixes=("_person","_film"))
 .loc[:,["name","title"]]
 .query("name.str.contains('Sky')")
)
                 name                    title
0      Luke Skywalker               A New Hope
17     Luke Skywalker  The Empire Strikes Back
33     Luke Skywalker       Return of the Jedi
53     Luke Skywalker      Revenge of the Sith
61   Anakin Skywalker      Revenge of the Sith
79   Anakin Skywalker       The Phantom Menace
94     Shmi Skywalker       The Phantom Menace
115  Anakin Skywalker     Attack of the Clones
123    Shmi Skywalker     Attack of the Clones
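If you would rather keep one row per person and swap each link for a value from the linked record (for example the species name), something like the following might work. This is only a sketch: `resolve_links` and `default_fetch` are names I made up for illustration, not part of pandas or SWAPI, and it assumes every linked record has the field you ask for (here `name`). Each distinct URL is downloaded once and cached.

```python
import pandas as pd


def default_fetch(url):
    """Download one linked resource and return its parsed JSON."""
    import requests  # imported here so a stub fetcher can be used offline

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()


def resolve_links(df, column, field, fetch=default_fetch):
    """Return a copy of df where every URL in `column` is replaced by the
    `field` value of the record it points to (hypothetical helper)."""
    cache = {}

    def lookup(url):
        # Download each distinct URL only once, however often it appears.
        if url not in cache:
            cache[url] = fetch(url)[field]
        return cache[url]

    def convert(value):
        if isinstance(value, list):   # e.g. species, films
            return [lookup(u) for u in value]
        if isinstance(value, str):    # e.g. homeworld
            return lookup(value)
        return value                  # leave missing values untouched

    out = df.copy()
    out[column] = out[column].map(convert)
    return out
```

With the `dfp` dataframe from above, `resolve_links(dfp, "species", "name")` would give a `species` column holding lists of species names instead of links, and `resolve_links(dfp, "homeworld", "name")` would do the same for the single homeworld link. Swapping in your own `fetch` function lets you try it without hitting the API.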
Upvotes: 1