How to merge a variable number of dataframes on the same index

Question

I have multiple (3 or more) dataframes which I need to merge.

Example df1:

           | clicks_US | dayofyear | weekday
2020-03-15 | 15000     | 75        | Sunday
2020-03-16 | 12000     | 76        | Monday
2020-03-17 | 10000     | 77        | Tuesday

Example df2:

           | clicks_UK | dayofyear | weekday
2020-03-15 | 13000     | 75        | Sunday
2020-03-16 | 9000      | 76        | Monday
2020-03-17 | 8000      | 77        | Tuesday

Example df3:

           | clicks_NZ | dayofyear | weekday
2020-03-15 | 7000      | 75        | Sunday
2020-03-16 | 5000      | 76        | Monday
2020-03-17 | 1000      | 77        | Tuesday

Desired output:

           | clicks_US | clicks_UK | clicks_NZ | dayofyear | weekday
2020-03-15 | 15000     | 13000     | 7000     | 75        | Sunday
2020-03-16 | 12000     | 9000      | 5000     | 76        | Monday
2020-03-17 | 10000     | 8000      | 1000     | 77        | Tuesday

The number of dataframes to merge varies, though, and is sometimes larger.

The column I want to merge on is the index, which holds datetimes in ISO 8601 format.

Because the number of dataframes changes each time, I searched for a flexible method but haven't found one yet.

Is there an easy method to define a list with the different dfs and just call

dfs = [df1, df2, df3, df4]
pd.merge(dfs, how="inner")

without having to chain for each df so that I can keep the number flexible?

NYC Coder · Accepted Answer

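You can do it in two steps: concatenate the dataframes along the columns (which aligns them on the index), then drop the duplicated columns: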

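import pandas as pd

dfs = [df1, df2, df3]
# Place all frames side by side, aligned on their shared index
df = pd.concat(dfs, axis=1)
# Drop the repeated dayofyear/weekday columns, keeping the first occurrence
df = df.loc[:, ~df.columns.duplicated()]
print(df)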

            clicks_US  dayofyear  weekday  clicks_UK  clicks_NZ
Date
2020-03-15      15000         75   Sunday      13000       7000
2020-03-16      12000         76   Monday       9000       5000
2020-03-17      10000         77  Tuesday       8000       1000
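
Since the question asks for how="inner", note that pd.concat defaults to an outer join on the index. Below is a minimal sketch of two ways to get inner-join behaviour for any number of frames. The sample data is rebuilt from the question, and make_df and merge_on_index are hypothetical helper names used only for illustration. Both variants assume the shared columns (dayofyear, weekday) hold identical values in every frame, so keeping only the first copy loses nothing.

```python
from functools import reduce

import pandas as pd

# Sample frames rebuilt from the question; "Date" as the index name is taken from the answer's output
idx = pd.DatetimeIndex(["2020-03-15", "2020-03-16", "2020-03-17"], name="Date")


def make_df(col, clicks):
    # Hypothetical helper: one clicks column plus the two columns every frame shares
    return pd.DataFrame(
        {
            col: clicks,
            "dayofyear": [75, 76, 77],
            "weekday": ["Sunday", "Monday", "Tuesday"],
        },
        index=idx,
    )


df1 = make_df("clicks_US", [15000, 12000, 10000])
df2 = make_df("clicks_UK", [13000, 9000, 8000])
df3 = make_df("clicks_NZ", [7000, 5000, 1000])
dfs = [df1, df2, df3]  # the list can hold any number of frames

# Option 1: concat with join="inner" keeps only index values present in every frame
combined = pd.concat(dfs, axis=1, join="inner")
combined = combined.loc[:, ~combined.columns.duplicated()]


# Option 2: fold the list with an index-on-index merge,
# taking only the columns the left side does not already have
def merge_on_index(left, right):
    new_cols = right.columns.difference(left.columns)
    return left.merge(right[new_cols], left_index=True, right_index=True, how="inner")


merged = reduce(merge_on_index, dfs)

print(combined)
print(merged)
```

Option 1 is the shorter of the two. Option 2 may be preferable if you later need merge-specific options such as a different how value or suffixes.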
