How to properly assign values to dataframe from strings?

Question

I have the following data sample:

{"rates":{
   "IT":{
     "country_name":"Italy",
     "standard_rate":20,
     "reduced_rates":{
       "food":13,
       "books":11
     }
  },

   "UK":{
     "country_name":"United Kingdom",
     "standard_rate":21,
     "reduced_rates":{
       "food":12,
       "books":1
     }
  }  
}}

IT and UK are country codes, and they can change. Every time I sample the data there might be different keys, so there is no constant key name that I can rely on.

I have the following code that creates the dataframe:

df = pd.DataFrame(columns=['code', 'country_name'])
for k,item in dic['rates'].items():
    df = df.append( {'code': k, 'country_name': item['country_name']} , ignore_index=True)

This gives me:

  code    country_name
0  IT       Italy
1  UK       United Kingdom

While this works, the docs (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.append.html) say that appending rows one at a time like this is inefficient.

The docs suggest using:

pd.concat([pd.DataFrame([i], columns=['A']) for i in range(5)], ignore_index=True)

So I tried to do:

new = pd.concat([pd.DataFrame([item], columns=['code', 'country_name']) for k,item in dic['rates'].items()], ignore_index=True)

However this gives:

   code  country_name
0  NaN     Italy
1  NaN     United Kingdom

I understand why this happens: there is no key called code in the sample. It is just the name I assigned to the column in the dataframe, and the real country code is the outer dictionary key. I don't know how to fix this, though.

Suggestions?

Rakesh · Accepted Answer

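Use a list comprehension to build one dict per country, taking the code from the outer key, and pass the whole list to pd.DataFrame in a single call.

Example: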

import pandas as pd

dic = {"rates":{
   "IT":{
     "country_name":"Italy",
     "standard_rate":20,
     "reduced_rates":{
       "food":13,
       "books":11
     }
  },

   "UK":{
     "country_name":"United Kingdom",
     "standard_rate":21,
     "reduced_rates":{
       "food":12,
       "books":1
     }
  }  
}}

df = pd.DataFrame([{'code': k, 'country_name': v["country_name"]} for k,v in dic["rates"].items()])
print(df)

Output:

  code    country_name
0   IT           Italy
1   UK  United Kingdom
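
To show where the NaN values in the question come from, here is a minimal sketch using the same sample data. The variable names one_row and fixed are only illustrative. It assumes a reasonably recent pandas; note that DataFrame.append was removed in pandas 2.0, so the concat or list-comprehension forms are the ones that still run there.

```python
import pandas as pd

# Sample data copied from the question.
dic = {"rates": {
    "IT": {"country_name": "Italy", "standard_rate": 20,
           "reduced_rates": {"food": 13, "books": 11}},
    "UK": {"country_name": "United Kingdom", "standard_rate": 21,
           "reduced_rates": {"food": 12, "books": 1}},
}}

# What the question's attempt does for a single country: the inner dict
# has no 'code' key, so asking for columns=['code', ...] leaves it as NaN.
one_row = pd.DataFrame([dic["rates"]["IT"]], columns=["code", "country_name"])
print(one_row)

# Same pd.concat pattern as in the docs, but each row dict is built
# explicitly so that the outer key (k) is stored under 'code'.
fixed = pd.concat(
    [pd.DataFrame([{"code": k, "country_name": item["country_name"]}])
     for k, item in dic["rates"].items()],
    ignore_index=True,
)
print(fixed)
```

Both this and the list comprehension above should give the same two rows. The list comprehension is probably the simpler choice, since it builds a single DataFrame instead of one small frame per country.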
