Reputation: 189
I have been trying to solve the problem of grouping my data (by Loan Number) and then returning the one row per group based on either the highest or lowest value of a certain column (here it is the Filter column) based on differing conditions. I realize I cannot use ifelse to do what I want but other examples have used if and else (hence my attempt). I have had all manner of errors along the way. Any help would be appreciated along with clarifying the problems
Example data
Loan_Number <- c(100,100,100,100,200,200,200,200,300,300,300,300)
Principal_Remaining <- c(50,50,50,50,5,5,0,0,10,10,10,10)
Principal_In_Arrears <- c(50,50,50,50,0,0,0,0,0,10,10,10)
Write_off_Number <- c(10,10,10,10, 0,0,0,0,0,0,0,0)
Filter <- c (1,2,3,4,5,6,7,8,9,10,11,12)
outcome <- as.data.frame(cbind(Loan_Number,Principal_In_Arrears, Principal_Remaining, Write_off_Number, Filter))
My last attempt at the code was
hope <- outcome %>%
group_by(Loan_Number) %>%
if(Principal_Remaining == 0) top_n( -1, wt = Filter) else
if(Principal_In_Arrears == 0) top_n( -1, wt = Filter) else
if(Write_off_Number >= 0) top_n( 1, wt = Filter) else top_n( -1, wt =
Filter)))
The idea being that if there is no principal left then I want a certain value and if there is principal left I have to check whether the loan is in arrears or has been written off.
NB To confirm the exact requirement, I need to avoid considering the rows that do not meet the condition. For example, for loan 200 the record returned should return row 7 (the lowest month where the principal is 0). The first answer did not do that. Also, loan 300 should return row 10 (condition should be != 0 and the min of Filter), the first month it goes into arrears. Loan 1 should just return row 1.
Upvotes: 0
Views: 155
Reputation: 189
at this point this works but I am not 100% it will continue to work as other combinations develop
hope <- outcome %>%
group_by(Banjo_Loan_No) %>%
dplyr::slice(case_when(any(Principal_Remaining == 0) ~ which.min(abs(filter*Principal_Remaining)),
any(Principal_in_Arrears == 0) ~ which.max(abs(filter*Principal_in_Arrears > 0)),
any(Write_Off_Date != "1016-01-01") ~ which.max(filter),
TRUE ~ which.min(filter)))
Upvotes: 0
Reputation: 388807
You can use case_when
with slice
to select the row for each Loan_Number
.
library(dplyr)
outcome %>%
group_by(Loan_Number) %>%
slice(case_when(any(Principal_Remaining == 0) ~ which.max(Filter),
any(Principal_In_Arrears == 0) ~ which.min(Filter),
any(Write_off_Number >= 0) ~ which.max(Filter),
TRUE ~ which.min(Filter))) %>%
ungroup
# Loan_Number Principal_In_Arrears Principal_Remaining Write_off_Number Filter
# <dbl> <dbl> <dbl> <dbl> <dbl>
#1 100 50 50 10 4
#2 200 0 0 0 8
#3 300 10 10 0 12
Upvotes: 1