Reputation: 25
I am trying to split my rows into multiple rows based on if a value exists in a column. I am able to filter based on the column value but would like to split the row into two based on the condition and return only specific parts of the row. The dataframe consists of different types of payments and the activity field represents the payment type that occurred. I would like for when there is a specific payment type to split the row into two. In the example below, when payment 2 exists I want it to be a separate transaction.
I have the following dataframe:
ID Payment 1 Payment 2 Payment 3 Payment 4 Payment 5 Activity
1 10 0 0 0 0 Payment 1
2 0 10 15 0 0 Payment 2, Payment 3
3 10 20 30 0 0 Payment 1, Payment 2, Payment 3
But I would like for the dataframe to be:
ID Payment 1 Payment 2 Payment 3 Payment 4 Payment 5 Activity
1 10 0 0 0 0 Payment 1
2 0 10 0 0 0 Payment 2
2 0 0 15 0 0 Payment 3
3 10 0 30 0 0 Payment 1, Payment 3
3 0 20 0 0 0 Payment 2
Upvotes: 0
Views: 1248
Reputation: 2803
The problem here is that you want to change both the Payment
columns and the Activity
column at the same time. So actually, I only see the naive approach of iterating through the DataFrame and writing new rows to a new DataFrame depending on the condition row['Payment 2'] == 0
like this:
df_new = pd.DataFrame(columns=['ID', 'Payment 1', 'Payment 2', 'Payment 3', 'Payment 4', 'Payment 5', 'Activity'])
for index, row in df.iterrows():
if row['Payment 2'] == 0:
df_new = df_new.append(row)
else:
row_new = pd.DataFrame({'ID':[row['ID']], 'Payment 1':[row['Payment 1']], 'Payment 2':[0], 'Payment 3':[row['Payment 3']], 'Payment 4':[row['Payment 4']], 'Payment 5':[row['Payment 5']], 'Activity':[', '.join([x for x in ['Payment 1', 'Payment 3', 'Payment 4', 'Payment 5'] if row[x] != 0])]})
df_new = df_new.append(row_new)
row_new = pd.DataFrame({'ID':[row['ID']], 'Payment 1':[0], 'Payment 2':[row['Payment 2']], 'Payment 3':[0], 'Payment 4':[0], 'Payment 5':[0], 'Activity':'Payment 2'})
df_new = df_new.append(row_new)
This will append two new rows (one with Payment 2
only and one with the rest) if there is a Payment 2
and append the original row from the source DataFrame if there is no Payment 2
.
Related questions on splitting rows (based on one column only and leaving the other ones untouched):
Upvotes: 1