pythonRcpp
pythonRcpp

Reputation: 2146

Append rows to dataframe, add new columns if not exist

I have a df like below which

>>df
  group  sub_group    max
0     A          1   30.0
1     B          1  300.0
2     B          2    3.0
3     A          2    2.0

I need to have group and sub_group as atrributes (columns) and max as row So I do

>>> newdf.set_index(['group','sub_group']).T
group         A      B         A
sub_group     1      1    2    2
max        30.0  300.0  3.0  2.0

This gives me my intended formatting

Now I need to merge it to another similar dataframe say

>>df2
  group  sub_group    max
0     C          1   3000.0
1     A          1   4000.0

Such that my merge results in

group         A      B         A     C
sub_group     1      1    2    2     1
max        30.0   300.0  3.0  2.0    NaN
max      4000.0    NaN   NaN  NaN    3000.0

Basically at every new df we are placing values under appropriate heading, if there is a new group or subgroup we add it the larger df. I am not sure if my way of transposing and then trying to merge append is a good approach

Since these df are generated in loop (loop items being dates), I would like to get a way to replace max printed in 1st column(of expected op) by loop date.

dates=['20170525', '20170623', '20170726']

for date in dates:
     df = pd.read_csv()

Upvotes: 2

Views: 4758

Answers (1)

jezrael
jezrael

Reputation: 862851

I think you can add parameter index_col to read_csv first for Multiindex from first and second column:

dfs = []
for date in dates:
     df = pd.read_csv('name', index_col=[0,1]) 
     dfs.append(df)

#another test df was added
print (df3)
                    max
group sub_group        
D     1          3000.0
E     1          4000.0

Then concat them together with parameter keys by list, then reshape by unstack and transpose:

#dfs = [df,df2,df3]
dates=['20170525', '20170623', '20170726']
df = pd.concat(dfs, keys=dates)['max'].unstack(0).T
print (df)
group           A           B            C       D       E
sub_group       1    2      1    2       1       1       1
20170525     30.0  2.0  300.0  3.0     NaN     NaN     NaN
20170623   4000.0  NaN    NaN  NaN  3000.0     NaN     NaN
20170726      NaN  NaN    NaN  NaN     NaN  3000.0  4000.0

Upvotes: 1

Related Questions