complog
complog

Reputation: 97

Dataframes combining with multiple columns as one column

I have the following 2 dataframes with the same structure (Same row names and column names) but different values:

DF1

date A B C D
2005-01-01 5 10 15 20
2005-01-02 1 2 3 4
2005-01-03 5 10 15 20
2005-01-04 5 10 15 20
... ... ... ... ...

DF2

date A B C D
2005-01-01 0,1 0,2 0,3 0,4
2005-01-02 0,2 0,4 0,6 0,8
2005-01-03 0,3 0,6 0,9 1
2005-01-04 0,4 0,5 0,6 0,7
... ... ... ... ...

My goal is to combine them as follows:

date Name DF1_value DF2_value
2005-01-01 A 5 0,1
2005-01-01 B 10 0,2
2005-01-01 C 15 0,3
2005-01-01 D 20 0,4
2005-01-02 A 1 0,2
2005-01-02 B 2 0,4
2005-01-02 C 3 0,6
2005-01-02 D 4 0,8
... ... ... ...

I tried to use melt() function for each dataframe and then concatenate (concat()) them but it creates unnecessary rows for the column "Name".

Any help would be most appreciated!

Upvotes: 1

Views: 271

Answers (1)

jezrael
jezrael

Reputation: 863531

Solution with DataFrame.melt for both DataFrames, then use concat only by DF2_value:

df11 = df1.melt('date', var_name='Name', value_name='DF1_value')
df22 = df2.melt('date', var_name='Name', value_name='DF2_value')

df = pd.concat([df11, df22['DF2_value']], axis=1)
print (df)
          date Name  DF1_value DF2_value
0   2005-01-01    A          5       0,1
1   2005-01-02    A          1       0,2
2   2005-01-03    A          5       0,3
3   2005-01-04    A          5       0,4
4   2005-01-01    B         10       0,2
5   2005-01-02    B          2       0,4
6   2005-01-03    B         10       0,6
7   2005-01-04    B         10       0,5
8   2005-01-01    C         15       0,3
9   2005-01-02    C          3       0,6
10  2005-01-03    C         15       0,9
11  2005-01-04    C         15       0,6
12  2005-01-01    D         20       0,4
13  2005-01-02    D          4       0,8
14  2005-01-03    D         20         1
15  2005-01-04    D         20       0,7

Or use DataFrame.set_index with DataFrame.stack for MultiIndex and join together:

df = (pd.concat([df1.set_index('date').stack().rename('DF1_value'),
                df2.set_index('date').stack().rename('DF1_value')], axis=1)
        .rename_axis(['date','Name'])
        .reset_index())
print (df)
          date Name DF1_value DF1_value
0   2005-01-01    A         5       0,1
1   2005-01-01    B        10       0,2
2   2005-01-01    C        15       0,3
3   2005-01-01    D        20       0,4
4   2005-01-02    A         1       0,2
5   2005-01-02    B         2       0,4
6   2005-01-02    C         3       0,6
7   2005-01-02    D         4       0,8
8   2005-01-03    A         5       0,3
9   2005-01-03    B        10       0,6
10  2005-01-03    C        15       0,9
11  2005-01-03    D        20         1
12  2005-01-04    A         5       0,4
13  2005-01-04    B        10       0,5
14  2005-01-04    C        15       0,6
15  2005-01-04    D        20       0,7

Upvotes: 3

Related Questions