Reputation: 145
I have the following dataframe (please refer to the dictionary below if you have problems with the tables):
account_id contract_id year_activation month_activated date_activated state term_months contract_type renewal_type
0 1 A 2020 December 2020-12-01 CA 6 Original Regular
1 1 B 2021 February 2021-02-13 CA 12 Upgrade Early
2 1 C 2021 February 2021-04-06 CA 12 Upgrade Early
3 1 I 2021 October 2021-10-23 NY 6 Winback Regular
4 1 N 2021 November 2021-11-11 NY 6 Renewal Early
5 2 K 2021 January 2021-01-01 IL 12 Original Regular
6 2 F 2021 March 2021-03-23 IL 6 Renewal Early
I want the output as follows:
account_id contract_id orig_year_activation orig_month_activated orig_date_activated orig_state year_activation month_activated date_activated state term_months contract_type renewal_type
0 1 A 2020 December 2020-12-01 CA 2020 December 2020-12-01 CA 6 Original Regular
1 1 B 2020 December 2020-12-01 CA 2021 February 2021-02-13 CA 12 Upgrade Early
2 1 C 2020 December 2020-12-01 CA 2021 February 2021-04-06 CA 12 Upgrade Early
3 1 I 2021 October 2021-10-23 NY 2021 October 2021-10-23 NY 6 Winback Regular
4 1 N 2021 October 2021-10-23 NY 2021 November 2021-11-11 NY 6 Renewal Early
5 2 K 2021 January 2021-01-01 IL 2021 January 2021-01-01 IL 12 Original Regular
6 2 F 2021 January 2021-01-01 IL 2021 March 2021-03-23 IL 6 Renewal Early
An account can have multiple deals. If an earlier contract has a contract_type of "Original", the succeeding contract will copy the details of that contract (year_activation, month_activated, date_activated, and state). If there is an earlier deal that has a contract_type of "Winback", the succeeding contracts will return the data from this contract under the columns "orig_year_activation", "orig_month_activated", "orig_date_activated", and "orig_state".
This is the dictionary for the dataframe:
{'account_id': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2},
'contract_id': {0: 'A', 1: 'B', 2: 'C', 3: 'I', 4: 'N', 5: 'K', 6: 'F'},
'year_activation': {0: 2020,
1: 2021,
2: 2021,
3: 2021,
4: 2021,
5: 2021,
6: 2021},
'month_activated': {0: 'December',
1: 'February',
2: 'February',
3: 'October',
4: 'November',
5: 'January',
6: 'March'},
'date_activated': {0: Timestamp('2020-12-01 00:00:00'),
1: Timestamp('2021-02-13 00:00:00'),
2: Timestamp('2021-04-06 00:00:00'),
3: Timestamp('2021-10-23 00:00:00'),
4: Timestamp('2021-11-11 00:00:00'),
5: Timestamp('2021-01-01 00:00:00'),
6: Timestamp('2021-03-23 00:00:00')},
'state': {0: 'CA', 1: 'CA', 2: 'CA', 3: 'NY', 4: 'NY', 5: 'IL', 6: 'IL'},
'term_months': {0: 6, 1: 12, 2: 12, 3: 6, 4: 6, 5: 12, 6: 6},
'contract_type': {0: 'Original',
1: 'Upgrade',
2: 'Upgrade',
3: 'Winback',
4: 'Renewal',
5: 'Original',
6: 'Renewal'},
'renewal_type': {0: 'Regular',
1: 'Early',
2: 'Early',
3: 'Regular',
4: 'Early',
5: 'Regular',
6: 'Early'}}
This is the dictionary for the result:
{'account_id': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2},
'contract_id': {0: 'A', 1: 'B', 2: 'C', 3: 'I', 4: 'N', 5: 'K', 6: 'F'},
'orig_year_activation': {0: 2020,
1: 2020,
2: 2020,
3: 2021,
4: 2021,
5: 2021,
6: 2021},
'orig_month_activated': {0: 'December',
1: 'December',
2: 'December',
3: 'October',
4: 'October',
5: 'January',
6: 'January'},
'orig_date_activated': {0: Timestamp('2020-12-01 00:00:00'),
1: Timestamp('2020-12-01 00:00:00'),
2: Timestamp('2020-12-01 00:00:00'),
3: Timestamp('2021-10-23 00:00:00'),
4: Timestamp('2021-10-23 00:00:00'),
5: Timestamp('2021-01-01 00:00:00'),
6: Timestamp('2021-01-01 00:00:00')},
'orig_state': {0: 'CA', 1: 'CA', 2: 'CA', 3: 'NY', 4: 'NY', 5: 'IL', 6: 'IL'},
'year_activation': {0: 2020,
1: 2021,
2: 2021,
3: 2021,
4: 2021,
5: 2021,
6: 2021},
'month_activated': {0: 'December',
1: 'February',
2: 'February',
3: 'October',
4: 'November',
5: 'January',
6: 'March'},
'date_activated': {0: Timestamp('2020-12-01 00:00:00'),
1: Timestamp('2021-02-13 00:00:00'),
2: Timestamp('2021-04-06 00:00:00'),
3: Timestamp('2021-10-23 00:00:00'),
4: Timestamp('2021-11-11 00:00:00'),
5: Timestamp('2021-01-01 00:00:00'),
6: Timestamp('2021-03-23 00:00:00')},
'state': {0: 'CA', 1: 'CA', 2: 'CA', 3: 'NY', 4: 'NY', 5: 'IL', 6: 'IL'},
'term_months': {0: 6, 1: 12, 2: 12, 3: 6, 4: 6, 5: 12, 6: 6},
'contract_type': {0: 'Original',
1: 'Upgrade',
2: 'Upgrade',
3: 'Winback',
4: 'Renewal',
5: 'Original',
6: 'Renewal'},
'renewal_type': {0: 'Regular',
1: 'Early',
2: 'Early',
3: 'Regular',
4: 'Early',
5: 'Regular',
6: 'Early'}}
Upvotes: 0
Views: 68
Reputation: 35646
import pandas as pd
# Full DataFrame Code Below
df = pd.DataFrame({...})
# Where contract type is Original or Winback set equal to value
mask = (df['contract_type'].eq('Original') | df['contract_type'].eq('Winback'))
# Add Columns At Row Based on Mask and assign values based on existing columns
df.loc[mask, ['orig_year_activation',
'orig_month_activated',
'orig_date_activated',
'orig_state']] = df.loc[mask, ['year_activation',
'month_activated',
'date_activated',
'state']].values
# Ensure Correct Types
df = df.convert_dtypes()
# Fill The NaNs
df = df.ffill()
# Re Order Columns
df = df[['account_id', 'contract_id',
'orig_year_activation', 'orig_month_activated',
'orig_date_activated', 'orig_state', 'year_activation',
'month_activated', 'date_activated', 'state',
'term_months', 'contract_type', 'renewal_type']]
# Output
print(df.to_string())
Output:
account_id contract_id orig_year_activation orig_month_activated orig_date_activated orig_state year_activation month_activated date_activated state term_months contract_type renewal_type 0 1 A 2020 December 2020-12-01 CA 2020 December 2020-12-01 CA 6 Original Regular 1 1 B 2020 December 2020-12-01 CA 2021 February 2021-02-13 CA 12 Upgrade Early 2 1 C 2020 December 2020-12-01 CA 2021 February 2021-04-06 CA 12 Upgrade Early 3 1 I 2021 October 2021-10-23 NY 2021 October 2021-10-23 NY 6 Winback Regular 4 1 N 2021 October 2021-10-23 NY 2021 November 2021-11-11 NY 6 Renewal Early 5 2 K 2021 January 2021-01-01 IL 2021 January 2021-01-01 IL 12 Original Regular 6 2 F 2021 January 2021-01-01 IL 2021 March 2021-03-23 IL 6 Renewal Early
Full DataFrame Construction Code:
df = pd.DataFrame({'account_id': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2},
'contract_id': {0: 'A', 1: 'B', 2: 'C', 3: 'I', 4: 'N',
5: 'K', 6: 'F'},
'year_activation': {0: 2020, 1: 2021, 2: 2021, 3: 2021,
4: 2021, 5: 2021, 6: 2021},
'month_activated': {0: 'December', 1: 'February',
2: 'February', 3: 'October',
4: 'November', 5: 'January',
6: 'March'},
'date_activated': {0: pd.to_datetime('2020-12-01 00:00:00'),
1: pd.to_datetime('2021-02-13 00:00:00'),
2: pd.to_datetime('2021-04-06 00:00:00'),
3: pd.to_datetime('2021-10-23 00:00:00'),
4: pd.to_datetime('2021-11-11 00:00:00'),
5: pd.to_datetime('2021-01-01 00:00:00'),
6: pd.to_datetime('2021-03-23 00:00:00')},
'state': {0: 'CA', 1: 'CA', 2: 'CA', 3: 'NY',
4: 'NY', 5: 'IL', 6: 'IL'},
'term_months': {0: 6, 1: 12, 2: 12, 3: 6,
4: 6, 5: 12, 6: 6},
'contract_type': {0: 'Original', 1: 'Upgrade',
2: 'Upgrade', 3: 'Winback',
4: 'Renewal', 5: 'Original',
6: 'Renewal'},
'renewal_type': {0: 'Regular', 1: 'Early',
2: 'Early', 3: 'Regular',
4: 'Early', 5: 'Regular',
6: 'Early'}})
Upvotes: 2