JuniorESE
JuniorESE

Reputation: 267

concat two dataframes conditionally

I have two dataframes and this is one of them, it contains last year's data:

                  col1  col2  col3  col4
Type  Date
type1 2020-01-01   1.0   2.0   3.0   4.0
      2020-02-01   1.0   2.0   3.0   4.0
      2020-03-01   1.0   2.0   3.0   4.0
      2020-04-01   1.0   2.0   3.0   4.0
      2020-05-01   1.0   2.0   3.0   4.0
type2 2020-01-01   1.0   2.0   3.0   4.0
      2020-02-01   1.0   2.0   3.0   4.0
      2020-03-01   1.0   2.0   3.0   4.0
      2020-04-01   1.0   2.0   3.0   4.0
      2020-05-01   1.0   2.0   3.0   4.0
type3 2020-01-01   1.0   2.0   3.0   4.0
      2020-02-01   1.0   2.0   3.0   4.0
      2020-03-01   1.0   2.0   3.0   4.0
      2020-04-01   1.0   2.0   3.0   4.0
      2020-05-01   1.0   2.0   3.0   4.0
type4 2020-01-01   1.0   2.0   3.0   4.0
      2020-02-01   1.0   2.0   3.0   4.0
      2020-03-01   1.0   2.0   3.0   4.0
      2020-04-01   1.0   2.0   3.0   4.0
      2020-05-01   1.0   2.0   3.0   4.0
type5 2020-01-01   1.0   2.0   3.0   4.0
      2020-02-01   1.0   2.0   3.0   4.0
      2020-03-01   1.0   2.0   3.0   4.0
      2020-04-01   1.0   2.0   3.0   4.0
      2020-05-01   1.0   2.0   3.0   4.0

The other one is new data:

                  col1  col2  col3  col4
Type  Date
type1 2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type2 2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type3 2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type4 2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type5 2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0

As you see, the things that are different are just dates and values. I create this new data every day and I want to concatenate this two dataframes in some conditions. I would like to see last year's values ​​here for comparison. For example, Since we are currently in January 2021, I just want to add January 2020 data here. When the date is February, this dataframe should no longer contain January 2020 data, only February 2020 data. However, I am adding the desired output here:

When the date is January 2021:

                  col1  col2  col3  col4
Type  Date
type1 2020-01-01   1.0   2.0   3.0   4.0
      2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type2 2020-01-01   1.0   2.0   3.0   4.0
      2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type3 2020-01-01   1.0   2.0   3.0   4.0
      2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type4 2020-01-01   1.0   2.0   3.0   4.0
      2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type5 2020-01-01   1.0   2.0   3.0   4.0
      2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0

What is the way to do this?

Upvotes: 1

Views: 43

Answers (1)

jezrael
jezrael

Reputation: 862511

Idea is convert actual time to month periods by Timestamp.to_period, subtract 12 for previous year and by filter second level Date selected by Index.get_level_values and converted to periods by DatetimeIndex.to_period :

now = pd.to_datetime('now').to_period('m') - 12

df11 = df1[ df1.index.get_level_values('Date').to_period('m') == now]
print (df11)
                  col1  col2  col3  col4
Type  Date                              
type1 2020-01-01   1.0   2.0   3.0   4.0
type2 2020-01-01   1.0   2.0   3.0   4.0
type3 2020-01-01   1.0   2.0   3.0   4.0
type4 2020-01-01   1.0   2.0   3.0   4.0
type5 2020-01-01   1.0   2.0   3.0   4.0

Last add to second DataFrame by concat and sorting MultiIndex:

df3 = pd.concat([df11, df2]).sort_index()
print (df3)
                  col1  col2  col3  col4
Type  Date                              
type1 2020-01-01   1.0   2.0   3.0   4.0
      2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type2 2020-01-01   1.0   2.0   3.0   4.0
      2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type3 2020-01-01   1.0   2.0   3.0   4.0
      2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type4 2020-01-01   1.0   2.0   3.0   4.0
      2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0
type5 2020-01-01   1.0   2.0   3.0   4.0
      2021-01-01  10.0  20.0  30.0  40.0
      2021-02-01   0.0   0.0   0.0   0.0
      2021-03-01   0.0   0.0   0.0   0.0
      2021-04-01   0.0   0.0   0.0   0.0
      2021-05-01   0.0   0.0   0.0   0.0

Upvotes: 1

Related Questions