I have a dataframe, df, and I wish to filter on one of its columns so that only rows with a value are kept and rows with a blank value are removed.
Name      Edit  Folder  Message  Date
Hello     T     Out              1/5/2020 5:00:00 AM
Hi        T     Out              1/5/2020 5:00:02 AM
          T     Out              1/5/2020 5:00:03 AM
Bye       T     Out              1/5/2020 5:00:04 AM
See you!  T     drafts           1/5/2020 5:00:05 AM
I wish to have this output:
Name      Edit  Folder  Message  Date
Hello     T     Out              1/5/2020 5:00:00 AM
Hi        T     Out              1/5/2020 5:00:02 AM
Bye       T     Out              1/5/2020 5:00:04 AM
See you!  T     drafts           1/5/2020 5:00:05 AM
So essentially the row with the empty Name value was removed.
This is how I am filtering:
df1 <- df %>%
  mutate(Date = lubridate::mdy_hms(Date),
         # `!=` is R's inequality operator; `!==` is a syntax error
         cond = Edit == "True" & Name != "" & Folder == "Out" | Folder == "drafts" & Message == "",
         grp = cumsum(!cond)) %>%
  filter(cond) %>%
  group_by(grp) %>%
  summarise(starttime = first(Date),
            endtime = last(Date),
            duration = difftime(endtime, starttime, units = "secs")) %>%
  select(-grp)
How can I add a condition to this code that keeps the rows where Name has a value and discards the rest?
dput:
structure(list(Name = structure(c(3L, 4L, 1L, 2L, 5L), .Label = c("",
"Bye", "Hello", "Hi", "See you!"), class = "factor"), Edit = c(TRUE,
TRUE, TRUE, TRUE, TRUE), Folder = structure(c(2L, 2L, 2L, 2L,
1L), .Label = c("drafts", "Out"), class = "factor"), Message = c(NA,
NA, NA, NA, NA), Date = structure(1:5, .Label = c("1/5/2020 5:00:00 AM",
"1/5/2020 5:00:02 AM", "1/5/2020 5:00:03 AM", "1/5/2020 5:00:04 AM",
"1/5/2020 5:00:05 AM"), class = "factor")), class = "data.frame", row.names = c(NA,
-5L))
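For reference, here is a minimal sketch of the behaviour I am after, assuming dplyr and lubridate are loaded. The sample data is rebuilt by hand from the dput above (as plain strings rather than factors). The names df_named and has_name are hypothetical, and the simplified cond (Edit is TRUE, Folder is Out or drafts, Message test dropped because Message is all NA in the sample) is an assumption about the intended logic, not a confirmed fix.

```r
library(dplyr)
library(lubridate)

# Sample data rebuilt from the dput output; strings instead of factors.
df <- data.frame(
  Name    = c("Hello", "Hi", "", "Bye", "See you!"),
  Edit    = c(TRUE, TRUE, TRUE, TRUE, TRUE),
  Folder  = c("Out", "Out", "Out", "Out", "drafts"),
  Message = NA,
  Date    = c("1/5/2020 5:00:00 AM", "1/5/2020 5:00:02 AM",
              "1/5/2020 5:00:03 AM", "1/5/2020 5:00:04 AM",
              "1/5/2020 5:00:05 AM"),
  stringsAsFactors = FALSE
)

# Plain filter: keep rows whose Name is neither NA nor an empty string.
df_named <- df %>%
  filter(!is.na(Name), Name != "")

# Same test folded into the grouping pipeline (assumed, simplified cond).
df1 <- df %>%
  mutate(Date = mdy_hms(Date),
         has_name = !is.na(Name) & Name != "",   # hypothetical helper column
         cond = has_name & Edit & Folder %in% c("Out", "drafts"),
         grp = cumsum(!cond)) %>%
  filter(cond) %>%
  group_by(grp) %>%
  summarise(starttime = first(Date),
            endtime = last(Date),
            duration = difftime(endtime, starttime, units = "secs")) %>%
  select(-grp)
```

On the sample data, df_named keeps the four rows with a Name, and the row with the empty Name splits the remaining rows into two groups in df1.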