Reputation: 293
I am trying to create a new column based other column criteria where my data looks like the following:
ID Column 1 Column 2 Column 3
1 2 Y "2013-10-22T10:09"
1 2 Y "2013-10-23T10:09"
2 3 N "2013-10-23T10:09"
3 0 Y "2013-10-23T10:09"
For each ID, I would like to keep only the earliest date/time as long as column 1 is greater than 0 and column 2 is not N. The results would look like this:
ID Column 1 Column 2 Column 3 Column 4
1 2 Y "2013-10-22T10:09" 2013-10-22
I currently tried this but I was wondering how to do it and if there is an elegant way of doing it:
library(dplyr)
ifelse(Column 1 >0 and Column 2 !="N",
(new %>%
group_by(ID) %>%
arrange(Column 3) %>%
slice(1L)))
Column 4 <- as.Date(Column 3, format='%Y-%m-%dT%H:%M')
Upvotes: 1
Views: 269
Reputation: 56
rm(list = ls())
df <- data.frame(id = c(1,1,2,3),column_1 = c(2,2,3,0),
column_2 = c("Y","Y","N","Y"),
column_3 = as.Date(c("2013-10-22","2013-10-23","2013-10-23","2013-10-23"),format = "%Y-%m-%d"))
n <- unique(df$id)
datalist <- list()
for(i in 1:n)
{
z <- df[df$id == i & df$column_1 > 0 & df$column_2 != "N" & df$column_3 == min(df$column_3),]
datalist[[i]] <- z
}
do.call(rbind,datalist)
This function will help you. But the constraints for each column were made constant. You can change it as per your convenience.
Thanks
Upvotes: 1
Reputation: 28705
library(dplyr)
df %>%
filter(Column1 > 0 & Column2 != 'N') %>% # filter out non-matching rows
group_by(ID) %>%
top_n(-1, Column3) %>% # select only the row with the earliest date-time
mutate(Date = as.Date(Column3)) # create date column
#
# # A tibble: 1 x 5
# # Groups: ID [1]
# ID Column1 Column2 Column3 Date
# <int> <int> <chr> <chr> <date>
# 1 1 2 Y 2013-10-22T10:09 2013-10-22
Upvotes: 3