orome

Reputation: 48466

Converting a list of dicts to a Pandas dataframe

I have a list of Python dicts, each with the same keys:

import numpy as np

dict_keys = ['k1','k2','k3','k4','k5','k6']  # More like 30 keys in practice
data = []
for i in range(20):  # More like 3000 in practice
    data.append({k: np.random.randint(100) for k in dict_keys})

and would like to use it to create a corresponding Pandas dataframe with a subset of the keys. My current approach is to take each dict from the list one at a time and append it to the dataframe using

import pandas as pd

df = pd.DataFrame(columns=['k1','k2','k5','k6'])
for d in data:
    # Note: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0
    df = df.append({k: d[k] for k in list(df.columns)}, ignore_index=True)
    # In practice, there are some calculations on some of the values here

but this is very slow (the actual list, and the dicts it contains, are both quite large).

Is there a better, faster (and more idiomatic) method for iterating through a list of dictionaries and adding them as rows to a Pandas dataframe?

Upvotes: 8

Views: 5314

Answers (1)

shx2

Reputation: 64318

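Simply pass data to the DataFrame constructor, or to DataFrame.from_records (either would work). To keep only a subset of the keys, pass them as the columns argument to the constructor, e.g. pd.DataFrame(data, columns=['k1','k2','k5','k6']).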

You might also want to set an index, e.g. DataFrame.from_records(data, index = 'k1').
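For illustration, here is a minimal sketch of both routes, using the sample data from the question. The seeded generator and the name `wanted` are assumptions added so the example is reproducible and self-contained; they are not part of the original post.

```python
import numpy as np
import pandas as pd

# Rebuild sample data like the question's (seeded here only for reproducibility)
dict_keys = ['k1', 'k2', 'k3', 'k4', 'k5', 'k6']
rng = np.random.default_rng(0)
data = [{k: int(rng.integers(100)) for k in dict_keys} for _ in range(20)]

# Hypothetical name for the subset of keys to keep
wanted = ['k1', 'k2', 'k5', 'k6']

# Route 1: the constructor; `columns` selects and orders the keys to keep
df = pd.DataFrame(data, columns=wanted)

# Route 2: from_records builds all columns, then the subset is selected
df_rec = pd.DataFrame.from_records(data)[wanted]

# Optional: use one of the keys as the index, as suggested above
df_idx = pd.DataFrame.from_records(data, index='k1')
```

Either route builds the whole frame in one call instead of growing it row by row, which is where the time in the original loop goes.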

If you also need to perform some calculations, it's usually easier and more convenient to do them on the DataFrame, after creating it. Leverage pandas!
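As a rough sketch of what "do it on the DataFrame" can look like: the question does not say what the calculations are, so the two derived columns below (`k1_plus_k2` and `k5_scaled`) are purely hypothetical stand-ins.

```python
import numpy as np
import pandas as pd

# Sample data shaped like the question's (seeded only for reproducibility)
dict_keys = ['k1', 'k2', 'k3', 'k4', 'k5', 'k6']
rng = np.random.default_rng(0)
data = [{k: int(rng.integers(100)) for k in dict_keys} for _ in range(20)]

# Build the frame once, keeping only the keys of interest
df = pd.DataFrame(data, columns=['k1', 'k2', 'k5', 'k6'])

# Hypothetical calculations, applied to whole columns at once
# rather than inside a per-dict Python loop
df['k1_plus_k2'] = df['k1'] + df['k2']
df['k5_scaled'] = df['k5'] / 100
```

Column-wise arithmetic like this runs over all rows in one vectorized operation, so it avoids the per-row overhead of computing values while appending.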

Upvotes: 15
