John Wendeborn
John Wendeborn

Reputation: 93

Combine multiple Pandas series with identical column names, but different indices

I have many pandas series structured more or less as follows.

s1                  s2                  s3                 s4
Date    val1        Date    val1        Date   val2        Date    val2
Jan     10          Apr     25          Jan    14          Apr     11
Feb     11          May     18          Feb    17          May     7
Mar     8           Jun     15          Mar    16          Jun     21

I would like to combine these series into a single data frame, with structure as follows:

Date    val1    val2
Jan     10      14
Feb     11      17
Mar     8       16
Apr     25      11
May     18      7
Jun     15      21

In an attempt to combine them, I have tried using pd.concat to create this single data frame. However, I have not been able to do so. The results of pd.concat(series, axis=1) (where series is a list [s1,s2,s3,s4]) is:

Date    val1    val1    val2    val2
Jan     10      nan     14      nan
Feb     11      nan     17      nan
Mar     8       nan     16      nan
Apr     nan     25      nan     11
May     nan     18      nan     7
Jun     nan     15      nan     21

And pd.concat(series, axis=0) simply creates a single series, ignoring the column names.

Is there a parameter in concat that will yield my desired result? Or is there some other function that can collapse the incorrect, nan-filled data frame into a frame with non-repeated columns and no nans?

Upvotes: 4

Views: 470

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150825

One way to do is groupby Date and choose first:

(pd.concat( [s1,s2,s3,s4])
   .groupby('Date', as_index=False, sort=False).first()
)

Output:

  Date  val1  val2
0  Jan    10    14
1  Feb    11    17
2  Mar     8    16
3  Apr    25    11
4  May    18     7
5  Jun    15    21

Upvotes: 2

Related Questions