Adding a subindex to merged dataframes

Question

I have 3 dataframes each with the same columns (years) and same indexes (countries).

Now I want to merge these 3 dataframes. But since all have the same columns it is appending those. So 'd like to keep the country index and add a subindex for each dataframe because all represent different numbers for each year.

#dataframe 1
    #CO2:
                           2005     2010      2015      2020
country
Afghanistan              169405   210161    259855    319447
Albania                     762      940      1154      1408
Algeria                  158336   215865    294768    400126

#dataframe 2
#Arrivals + Departures:
                            2005      2010      2015      2020
country
Afghanistan               977896   1326120   1794547   2414943
Albania                   103132    154219    224308    319440
Algeria                  3775374   5307448   7389427  10159656

#data frame 3
#Travel distance in km:
                                2005          2010          2015           2020
country
Afghanistan               9330447004   12529259781   16776152792    22337458954
Albania                     63159063      82810491     107799357      139543748
Algeria                  12254674181   17776784271   25782632480    37150057977

The result should be something like:

                                2005          2010          2015           2020
    country
    Afghanistan co2           169405        210161         259855         319447
                flights       977896        1326120       1794547        2414943
                traveldistance   9330447004    12529259781   16776152792    22337458954
    Albania ....

How can I do this? NOTE: The years are an input so these are not fixed. They could just be 2005,2010 for example. Thanks in advance.

M_S_N · Accepted Answer

I have tried to solve the problem using concat and groupby using your dataset hope it helps

First concat the 3 dfs

l=[df,df2,df3]

f=pd.concat(l,keys= ['CO2','Flights','traveldistance'],axis=0,).reset_index().rename(columns={'level_0':'Category'})

the use groupby to get the values

result_df=f.groupby(['country', 'Category'])[f.columns[2:]].first()

Hope it helps and solve your problem

Output looks like this

Adding a subindex to merged dataframes

Answers (1)

Related Questions