Lynn

Reputation: 4388

Filter a column to only reveal when there is a value, and remove the others in R (R, dplyr, lubridate)

I have a data frame, df. I want to filter it on one of its columns so that only the rows with a value in that column are kept and the rows with blank values are removed.

     Name        Edit    Folder    Message    Date
     Hello       T       Out                  1/5/2020 5:00:00 AM
     Hi          T       Out                  1/5/2020 5:00:02 AM
                 T       Out                  1/5/2020 5:00:03 AM
     Bye         T       Out                  1/5/2020 5:00:04 AM
     See you!    T       drafts               1/5/2020 5:00:05 AM

I wish to have this output:

     Name        Edit    Folder    Message    Date
     Hello       T       Out                  1/5/2020 5:00:00 AM
     Hi          T       Out                  1/5/2020 5:00:02 AM
     Bye         T       Out                  1/5/2020 5:00:04 AM
     See you!    T       drafts               1/5/2020 5:00:05 AM

So essentially the row with the empty Name value was removed.

This is how I am filtering:

 df1 <- df %>%
   mutate(Date = lubridate::mdy_hms(Date),
          cond = Edit == "True" & Name != "" & Folder == "Out" | Folder == "drafts" & Message == "",
          grp = cumsum(!cond)) %>%
   filter(cond) %>%
   group_by(grp) %>%
   summarise(starttime = first(Date),
             endtime = last(Date),
             duration = difftime(endtime, starttime, units = "secs")) %>%
   select(-grp)

How would I add a condition to this code that keeps the rows where Name has a value and discards the others?

dput:

 structure(list(Name = structure(c(3L, 4L, 1L, 2L, 5L), .Label = c("", 
 "Bye", "Hello", "Hi", "See you!"), class = "factor"), Edit = c(TRUE, 
 TRUE, TRUE, TRUE, TRUE), Folder = structure(c(2L, 2L, 2L, 2L, 
 1L), .Label = c("drafts", "Out"), class = "factor"), Message = c(NA, 
 NA, NA, NA, NA), Date = structure(1:5, .Label = c("1/5/2020 5:00:00 AM", 
"1/5/2020 5:00:02 AM", "1/5/2020 5:00:03 AM", "1/5/2020 5:00:04 AM", 
"1/5/2020 5:00:05 AM"), class = "factor")), class = "data.frame", row.names = c(NA, 
 -5L))

Upvotes: 1

Views: 41

Answers (1)

akrun

Reputation: 886938

In base R, we can use subset on the original data frame to keep only the rows where Name is not an empty string:

 subset(df, Name != "")
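Since the question is tagged dplyr, the same row filter can probably be written with filter() as well. The following is only a sketch: it rebuilds the sample data from the dput in the question, but with character vectors instead of factors (an assumption made to keep the example short), and df_named is a hypothetical name for the result.

```r
library(dplyr)

# Sample data rebuilt from the dput in the question.
# Assumption: character columns instead of factors, for brevity.
df <- data.frame(
  Name    = c("Hello", "Hi", "", "Bye", "See you!"),
  Edit    = TRUE,
  Folder  = c("Out", "Out", "Out", "Out", "drafts"),
  Message = NA,
  Date    = c("1/5/2020 5:00:00 AM", "1/5/2020 5:00:02 AM",
              "1/5/2020 5:00:03 AM", "1/5/2020 5:00:04 AM",
              "1/5/2020 5:00:05 AM"),
  stringsAsFactors = FALSE
)

# Keep rows whose Name is neither NA nor an empty string.
# df_named is a hypothetical name, not taken from the question.
df_named <- df %>%
  filter(!is.na(Name), Name != "")

print(df_named)
```

If the blank-name rows should never reach the duration calculation, one option would be to place this filter() call before the mutate() step of the pipeline in the question.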

Upvotes: 2
