Yam

Reputation: 145

Change date in columns based on actual data in other columns

I have the following dataframe:

   account_id  contract_id  date_activated  2021-12-01 00:00:00  2021-01-01 00:00:00  2021-02-01 00:00:00  2021-03-01 00:00:00  2021-04-01 00:00:00  2021-05-01 00:00:00  2021-06-01 00:00:00
0           1            A      2020-12-04                200.0                200.0                200.0                  0.0                  0.0                  0.0                  0.0
1           1            B      2021-03-09                  0.0                  0.0                300.0                300.0                300.0                300.0                300.0
2           1            C      2021-04-25                  0.0                  0.0                  0.0                  0.0                100.0                100.0                100.0

I want to change the date_activated column when its month and year do not match the first monthly payment, i.e. the first column (from the 4th column onward) that holds a non-zero amount. In that case date_activated should be replaced with that column's header, which is the first day of the month of the first payment. Rows whose date_activated already matches the month and year of the first payment should keep their original date.

I want output like this:

   account_id  contract_id  date_activated  2021-12-01 00:00:00  2021-01-01 00:00:00  2021-02-01 00:00:00  2021-03-01 00:00:00  2021-04-01 00:00:00  2021-05-01 00:00:00  2021-06-01 00:00:00
0           1            A      2021-12-01                200.0                200.0                200.0                  0.0                  0.0                  0.0                  0.0
1           1            B      2021-02-01                  0.0                  0.0                300.0                300.0                300.0                300.0                300.0
2           1            C      2021-04-25                  0.0                  0.0                  0.0                  0.0                100.0                100.0                100.0

Here's the dictionary for the dataframe:

{'account_id': {0: 1, 1: 1, 2: 1},
 'contract_id': {0: 'A', 1: 'B', 2: 'C'},
 'date_activated': {0: Timestamp('2020-12-04 00:00:00'),
  1: Timestamp('2021-03-09 00:00:00'),
  2: Timestamp('2021-04-25 00:00:00')},
 datetime.datetime(2021, 12, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 1, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 2, 1, 0, 0): {0: 200.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 3, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 4, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 5, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 6, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0}}

Here's the dictionary for the desired output:

{'account_id': {0: 1, 1: 1, 2: 1},
 'contract_id': {0: 'A', 1: 'B', 2: 'C'},
 'date_activated': {0: Timestamp('2021-12-01 00:00:00'),
  1: Timestamp('2021-02-01 00:00:00'),
  2: Timestamp('2021-04-25 00:00:00')},
 datetime.datetime(2021, 12, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 1, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 2, 1, 0, 0): {0: 200.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 3, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 4, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 5, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 6, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0}}

Upvotes: 1

Views: 63

Answers (1)

Ynjxsjmh

Reputation: 29982

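You can use pandas.Series.first_valid_index() to get the label of the first non-null value in each row, after replacing 0 with np.nan in the payment columns (the ones with datetime headers).

Then use pandas.Series.where() to keep date_activated where its month and year already match that label, and replace it with the label where the condition is False.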
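To see the two calls in isolation, here is a small sketch on made-up data. The names payments, values, keep and fallback are only for illustration and are not part of the question's dataframe.

```python
import numpy as np
import pandas as pd

# One row of payments, indexed by month (toy data, same column order idea as the question)
payments = pd.Series(
    [0.0, 0.0, 300.0, 300.0],
    index=pd.to_datetime(["2021-12-01", "2021-01-01", "2021-02-01", "2021-03-01"]),
)

# Zeros become NaN, so the first "valid" label is the first non-zero payment
first_label = payments.replace(0, np.nan).first_valid_index()
print(first_label)

# where() keeps values where the mask is True and takes the fallback elsewhere
values = pd.Series([10, 20, 30])
keep = pd.Series([True, False, True])
fallback = pd.Series([-1, -2, -3])
result = values.where(keep, fallback)
print(result.tolist())
```

Note that first_valid_index() follows the order of the index, not chronological order, so the result depends on how the columns are arranged.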

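import numpy as np
import pandas as pd

# Column label of the first non-zero payment in each row
idx = df.iloc[:, 3:].replace(0, np.nan).T.apply(pd.Series.first_valid_index)

m = (df['date_activated'].dt.year == idx.dt.year) & (df['date_activated'].dt.month == idx.dt.month)

# Keep dates whose month and year match, otherwise use the first payment month
df['date_activated'] = df['date_activated'].where(m, idx)
print(df['date_activated'])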

0   2021-12-01
1   2021-02-01
2   2021-04-25
Name: date_activated, dtype: datetime64[ns]
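For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the frame from the question's dictionary and applies the same steps. The variable names first_paid, same_month and fixed are my own, and the pd.to_datetime() wrapper is an added safeguard in case the labels come back as plain datetime objects. It assumes the payment columns start at position 3.

```python
import datetime

import numpy as np
import pandas as pd

# Frame rebuilt from the dictionary in the question (column order preserved)
df = pd.DataFrame({
    "account_id": {0: 1, 1: 1, 2: 1},
    "contract_id": {0: "A", 1: "B", 2: "C"},
    "date_activated": {0: pd.Timestamp("2020-12-04"),
                       1: pd.Timestamp("2021-03-09"),
                       2: pd.Timestamp("2021-04-25")},
    datetime.datetime(2021, 12, 1): {0: 200.0, 1: 0.0, 2: 0.0},
    datetime.datetime(2021, 1, 1): {0: 200.0, 1: 0.0, 2: 0.0},
    datetime.datetime(2021, 2, 1): {0: 200.0, 1: 300.0, 2: 0.0},
    datetime.datetime(2021, 3, 1): {0: 0.0, 1: 300.0, 2: 0.0},
    datetime.datetime(2021, 4, 1): {0: 0.0, 1: 300.0, 2: 100.0},
    datetime.datetime(2021, 5, 1): {0: 0.0, 1: 300.0, 2: 100.0},
    datetime.datetime(2021, 6, 1): {0: 0.0, 1: 300.0, 2: 100.0},
})

# Assumption: payment columns start at position 3, as in the question
payments = df.iloc[:, 3:]

# Header of the first non-zero payment column per row (NaT if a row has no payments)
first_paid = pd.to_datetime(
    payments.replace(0, np.nan).T.apply(pd.Series.first_valid_index)
)

# True where date_activated is already in the same month and year
same_month = (
    (df["date_activated"].dt.year == first_paid.dt.year)
    & (df["date_activated"].dt.month == first_paid.dt.month)
)

# Keep matching dates, replace the others with the first payment month
fixed = df["date_activated"].where(same_month, first_paid)
print(fixed)
```

With the question's data this should give 2021-12-01, 2021-02-01 and 2021-04-25. Row 0 ends up as 2021-12-01 because that column comes first in the frame, even though it is not the earliest date chronologically.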

Upvotes: 2
