Flatten list of dict (array) as new columns not working with json normalize

Question

Apologies as I'm quite inexperienced with Python and looking for assistance here 🙏

I am trying to flatten a list of dicts into new columns in an existing dataframe, with using key as column names and values as cell content. Now json_normalize is successful in unwrapping the data, but fails to place dict*value*1 as column header, and dict*value*2 as cell content. In the examples I saw in other posts, it did work, but somehow not for me.

The column originally looks like this :

View of the column (1D array) before unwrapping

What I obtain is something like this:

View after unwrapping with json_normalize:

View after unwrapping with json_normalize

Obviously what I'd like to achieve is something like this:

Desired Outpu:

Desired Output

I have tried pd.DataFrame(form_records(df$$"fields"$$), and also tried pd.json_normalize(df$$"fields"$$) which both came to the same output. The returned dataframe has a dict in each cell which is not what I am trying to get. What I'd like is "id" as column name, and "value" as cell content for every new column created.

Eventually I tried after json_normalize to extract the "ids" to rename my columns, like so :

#extracting row 1
my_list = fields_values.iloc[:1].values.flatten().tolist()

# getting all values under the "id" key 
ids = [d['id'] for d in my_list]

#extracting all current column names 
col_names = fields_values.columns.tolist()

#zipping the extract "ids" with the current col names 
new_col_names = dict(zip(col_names, ids))

#renaming the columns with the ids 
fields_values.rename(columns=new_col_names, inplace=True)

This eventually gave me the desired col names, but I still have the cell content with the dicts instead of only the value under "value" ... So I'm not sure if this is the best practice :

Correct col names but incorrect cell content:

correct col names but incorrect cell content

Does any one has any advice on this? Is there something I'm missing in the json_normalize function arguments, or shall I continue with extracting per row/col the dict values and replace in each cell (I find this to be such a pain!)? Isn't there a more efficient way to do this?

Flatten list of dict (array) as new columns not working with json normalize

Answers (1)

Related Questions