Reputation: 63
I am attempting to filter the following data frame:
calendar <- data.frame(Account_Num,Account_Num_Spec,Billing_Num,Transaction_Amt2,color,CAL_DATE2)
In the following manner:
For all rows that have the same contents in the "Account_Num_Spec", "Billing_Num" and "color" - filter them into one row that has the sum of "Transaction_Amt2" and the first oldest date in "CAL_DATE2".
For example
Ac_Num AC_Num_Spec Bill_Num Amount Color Date
1 1162 1162-001 620919 10 red 10/12/19
2 1179 1179-001 790719 10 red 11/12/19
3 1172 1172-001 721119 10 red 12/12/19
4 1162 1162-001 620919 -5 red 09/12/19
5 1162 1162-001 621019 10 red 11/12/19
6 1179 1179-001 791119 10 green 11/12/19
7 1162 1162-002 620919 10 red 12/12/19
8 1179 1179-001 791079 10 green 10/12/19
goes to:
Ac_Num AC_Num_Spec Bill_Num Amount Color Date
1 1162 1162-001 620919 5 red 09/12/19
2 1179 1179-001 790719 10 red 11/12/19
3 1179 1179-001 791119 10 red 12/12/19
4 1172 1172-001 721119 10 red 12/12/19
5 1162 1162-001 621019 10 red 11/12/19
6 1179 1179-001 791119 10 green 11/12/19
7 1162 1162-002 620919 10 red 12/12/19
8 1179 1179-001 791079 10 green 10/12/19
Upvotes: 1
Views: 27
Reputation: 73
I believe using group_by
and summarize
within the dplyr
package should work for what you are asking
calendar %>%
group_by(Account_Num,Account_Num_Spec,Billing_Num, color) %>%
summarize(Amount = sum(Amount),
Date = max(Date))
I know you didn't specify that Account_Num
had to be matching, but from your example that looked to be the case so I included it within the group_by
function.
Upvotes: 1