Krish1992

Reputation: 81

Create a dataframe from multiple list of dictionary values

I have the code below:

import pandas as pd

# analy_df maps each company key to its data; build one frame per company
safety_df = {}
for key3, safety in analy_df.items():
    # Prefix every ratio column with the company key, e.g. "C1_CR"
    safety = pd.DataFrame({"Year": safety['index'],
                           f"{key3}_CR": safety['CURRENT'],
                           f"{key3}_ICR": safety['ICR'],
                           f"{key3}_D/E": safety['D/E'],
                           f"{key3}_D/A": safety['D/A']})
    safety_df[key3] = safety

In this code I'm extracting values from another dictionary. It loops over the various companies, which is why I build the column names from the key using format. The output contains the 5 columns above for each company (Year, CR, ICR, D/E, D/A).

The output that gets printed contains plenty of NA values. What I want is one common column, Year, for all companies, followed by the columns C1_CR, C2_CR, C3_CR, C1_ICR, C2_ICR, C3_ICR, ..., C3_D/A.

I tried to combine them using the following code:

pd.concat(safety_df.values())

Sample output of this:

[Image: sample output]

This extracts the values for each company, but NA values are printed as well. Is that because of the for loop?

I also tried groupby, but that did not work.

How can I set Year as the common column and print the other values side by side?

Thanks

Upvotes: 0

Views: 736

Answers (1)

9769953

Reputation: 12221

Use axis=1 to concatenate along the columns:

import numpy as np
import pandas as pd

years = np.arange(2010, 2021)
n = len(years)
c1 = np.random.rand(n)
c2 = np.random.rand(n)
c3 = np.random.rand(n)

frames = {
    'a': pd.DataFrame({'year': years, 'c1': c1}),
    'b': pd.DataFrame({'year': years, 'c2': c2}),
    'c': pd.DataFrame({'year': years[1:], 'c3': c3[1:]}),
}
for key in frames:
    frames[key].set_index('year', inplace=True)

df = pd.concat(frames.values(), axis=1)
print(df)

which results in

            c1        c2        c3
year
2010  0.956494  0.667499       NaN
2011  0.945344  0.578535  0.780039
2012  0.262117  0.080678  0.084415
2013  0.458592  0.390832  0.310181
2014  0.094028  0.843971  0.886331
2015  0.774905  0.192438  0.883722
2016  0.254918  0.095353  0.774190
2017  0.724667  0.397913  0.650906
2018  0.277498  0.531180  0.091791
2019  0.238076  0.917023  0.387511
2020  0.677015  0.159720  0.063264

Note that I have explicitly set the index to the 'year' column, and in my example I have removed the first year from frame 'c' (column 'c3'). This shows how the indices of the different dataframes are matched when concatenating. Had the index been left at its default (0, 1, 2, ...), the rows would have been matched by position rather than by year, so the years would have ended up out of sync and the NaN value would have appeared at the bottom of column 'c3' instead.
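Applied to the layout in the question, the same idea might look like the sketch below. It assumes each value in analy_df is a DataFrame with an 'index' column holding the year plus the four ratio columns; the sample numbers and the names `ratios`, `frames` and `wide` are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical stand-in for the question's analy_df (made-up numbers).
analy_df = {
    "C1": pd.DataFrame({"index": [2018, 2019, 2020],
                        "CURRENT": [1.2, 1.3, 1.1],
                        "ICR": [4.0, 3.5, 3.8],
                        "D/E": [0.5, 0.6, 0.55],
                        "D/A": [0.3, 0.35, 0.33]}),
    "C2": pd.DataFrame({"index": [2019, 2020],   # one year missing on purpose
                        "CURRENT": [2.0, 2.1],
                        "ICR": [6.0, 6.2],
                        "D/E": [0.2, 0.25],
                        "D/A": [0.1, 0.12]}),
}

# Source column name -> short suffix used in the output column names.
ratios = {"CURRENT": "CR", "ICR": "ICR", "D/E": "D/E", "D/A": "D/A"}

frames = []
for company, data in analy_df.items():
    frame = (
        data.set_index("index")[list(ratios)]   # the year becomes the index
        .rename_axis("Year")
        .rename(columns={col: f"{company}_{short}"
                         for col, short in ratios.items()})
    )
    frames.append(frame)

# axis=1 places the frames side by side, aligning rows on the Year index.
wide = pd.concat(frames, axis=1).sort_index().rename_axis("Year")

# Reorder so the same ratio of every company sits together:
# C1_CR, C2_CR, C1_ICR, C2_ICR, ...
ordered = [f"{company}_{short}"
           for short in ratios.values()
           for company in analy_df]
wide = wide[ordered]
print(wide)
```

Years that a company does not have still show up as NaN in that company's columns, but each year now appears only once, as the index.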

Upvotes: 1
