Reputation: 525
UPRN Start.Date End.Date Disability
1 2006-12-20 17-NOV-17 Y
1 2006-12-20 17-NOV-17 N
2 1991-12-06 N
2 1991-12-06 N
3 1991-04-29 2015-04-21 N
3 2015-04-22 Y
4 2005-02-15 Y
4 2005-02-15 N
I have a dataset that looks something like the above (but much bigger). I would like to create a new column called Any_Disability.
The way I want to do this is to group by UPRN, Start.Date and End.Date and if any row in that group has a disability then both rows would have a "Y" for Any_Disability.
What I've tried already is:
library(dplyr)
test3<-all_data%>%
group_by(UPRN, Start.Date, End.Date)%>%
mutate(Any_Disability = ifelse(Disability=="Y", "Y","N"))
But this doesn't work as it gives the below answer:
UPRN Start.Date End.Date Disability Any_Disability
1 2006-12-20 17-NOV-17 Y Y
1 2006-12-20 17-NOV-17 N N
2 1991-12-06 N N
2 1991-12-06 N N
3 1991-04-29 2015-04-21 N N
3 2015-04-22 Y Y
4 2005-02-15 Y Y
4 2005-02-15 N N
Reproducible code:
UPRN<-c(1,1,2,2,3,3,4,4)
Start.Date<-c("2006-12-20","2006-12-20", "1991-12-06","1991-12-06","1991-04-29", "2015-04-22","2005-02-15", "2005-02-15")
End.Date<-c("17-NOV-17", "17-NOV-17", "","", "2015-04-21", "", "", "")
Disability<-c("Y","N","N","N","N","Y","Y","N")
dataset <- data.frame(UPRN, Start.Date, End.Date, Disability)
Upvotes: 0
Views: 96
Reputation: 1169
Base R, using ave
and 'Y' > 'N' ... TRUE
dataset_new <- within(dataset,
Any_Disability <- ave(Disability, paste0(UPRN, "#", Start.Date, "#", End.Date),
FUN = max)
)
# UPRN Start.Date End.Date Disability Any_Disability
# 1 2006-12-20 17-NOV-17 Y Y
# 1 2006-12-20 17-NOV-17 N Y
# 2 1991-12-06 N N
# 2 1991-12-06 N N
# 3 1991-04-29 2015-04-21 N N
# 3 2015-04-22 Y Y
# 4 2005-02-15 Y Y
# 4 2005-02-15 N Y
You could writeFUN
= function(x){if (any('Y' %in% x)) 'Y' else 'N'}
, not using the 'Y' > 'N' trick, for the same result.
Upvotes: 0
Reputation: 323226
We can using any
test3<-df%>%
group_by(UPRN, Start.Date)%>%
dplyr::mutate(Any_Disability = ifelse(any(Disability=="Y"), "Y","N"))
test3
# A tibble: 8 x 5
# Groups: UPRN, Start.Date [5]
UPRN Start.Date End.Date Disability Any_Disability
<int> <chr> <chr> <chr> <chr>
1 1 2006-12-20 17-NOV-17 Y Y
2 1 2006-12-20 17-NOV-17 N Y
3 2 1991-12-06 <NA> N N
4 2 1991-12-06 <NA> N N
5 3 1991-04-29 2015-04-21 N N
6 3 2015-04-22 <NA> Y Y
7 4 2005-02-15 <NA> Y Y
8 4 2005-02-15 <NA> N Y
Upvotes: 3