Adding lists from nested JSON file to pandas dataframe?

Question

I have a Json file that looks like the following. I want to grab the strings of names in the "actors" list and add them to a dataframe (which is empty now, the first item added to the dataframe would be the strings of actor names as rows).


{
    "1": {
        "title": "Exodus: Gods and Kings",
        "url": "https://en.wikipedia.org/wiki/Exodus%3A%20Gods%20and%20Kings",
        "year": "2014",
        "poster": "https://upload.wikimedia.org/wikipedia/en/thumb/c/cd/Exodus2014Poster.jpg/220px-Exodus2014Poster.jpg",
        "actors": [
            "Christian Bale",
            "Joel Edgerton",
            "John Turturro",
            "Aaron Paul",
            "Ben Mendelsohn",
            "Sigourney Weaver",
            "Ben Kingsley"
        ]
    },
...

I have tried using the following python code to do this but I am unsuccesful, I beleive because I am using a function wrong or not using the right function at all. Any suggestions as to what function/method to use?

# Create dataframe from json file
df_json = pd.read_json("movies_metadata.json", encoding='latin-1')

# Create new dataframe with actor names
data = [df.iloc[4]]
df = pd.DataFrame(data)

I strongly beleive that my code is very poor, but have had a hard time finding how to do this when googling.

Tried googling all around, as well as different methods from pandas to add items to dataframes

Naveed · Accepted Answer

# read in the json file
df =pd.read_json('txt.json')


#if you have multiple json records, each will be its own columns
# filter the actor rows and then explode 
df.loc['actors',:].explode()

1       Christian Bale
1        Joel Edgerton
1        John Turturro
1           Aaron Paul
1       Ben Mendelsohn
1     Sigourney Weaver
1         Ben Kingsley
2      2Christian Bale
2       2Joel Edgerton
2       2John Turturro
2          2Aaron Paul
2      2Ben Mendelsohn
2    2Sigourney Weaver
2        2Ben Kingsley
Name: actors, dtype: object

Resetting the index

df.loc['actors',:].explode().reset_index()

    index   actors
0   1   Christian Bale
1   1   Joel Edgerton
2   1   John Turturro
3   1   Aaron Paul
4   1   Ben Mendelsohn
5   1   Sigourney Weaver
6   1   Ben Kingsley
7   2   2Christian Bale
8   2   2Joel Edgerton
9   2   2John Turturro
10  2   2Aaron Paul
11  2   2Ben Mendelsohn
12  2   2Sigourney Weaver
13  2   2Ben Kingsley

Alternate Solution

(df[df.index.isin( ['actors','title'])]
 .T
 .explode('actors')
 .reset_index())


    index   title               actors
0   1   Exodus: Gods and Kings  Christian Bale
1   1   Exodus: Gods and Kings  Joel Edgerton
2   1   Exodus: Gods and Kings  John Turturro
3   1   Exodus: Gods and Kings  Aaron Paul
4   1   Exodus: Gods and Kings  Ben Mendelsohn
5   1   Exodus: Gods and Kings  Sigourney Weaver
6   1   Exodus: Gods and Kings  Ben Kingsley
7   2   Exodus: Gods and Kings  2Christian Bale
8   2   Exodus: Gods and Kings  2Joel Edgerton
9   2   Exodus: Gods and Kings  2John Turturro
10  2   Exodus: Gods and Kings  2Aaron Paul
11  2   Exodus: Gods and Kings  2Ben Mendelsohn
12  2   Exodus: Gods and Kings  2Sigourney Weaver
13  2   Exodus: Gods and Kings  2Ben Kingsley

PS: I expanded your JSON file to hae two records in it

Adding lists from nested JSON file to pandas dataframe?

Answers (2)

Related Questions