Reputation: 549
This is an extension of the question I asked here where I was looking for a way to automate my labeling of subjects into groups based on if their data matched my filter.
Prior to attempting to automate the labeling, this is what I had:
library(tidyverse)
df <- structure(list(Subj_ID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
Location = c(1, 2, 3, 1, 4, 2, 1, 2, 5)), class = "data.frame",
row.names = c(NA, -9L))
df2 <- df %>%
mutate(group=
if_else(Subj_ID ==1,
"Treatment",
if_else(Subj_ID == 2,
"Control","Withdrawn")))
complete.df <- df2 %>% filter(complete.cases(.))
In my actual data, some rows contain NAs, and I need to be able to filter for both complete and incomplete cases so I can review the subsets separately if needed. My new code, shown below, assigns a subject to a group based on whether any of their Location values is 4 or 5:
df2 <- df %>%
mutate(group=
if_else(Subj_ID ==1,
"Treatment",
if_else(Subj_ID == 2,
"Control","Withdrawn")))
df3 <- df2 %>% ##this chunk breaks filter(complete.cases(.))
group_by(Subj_ID) %>%
mutate(group2 = case_when(any(Location == 4) | any(Location == 5) ~ "YES", TRUE ~ "NO"))
complete.df <- df3 %>% filter(complete.cases(.))
Once I generate df3 by mutating df2, my filter(complete.cases(.)) subsequently fails.
Yet if I generate df3 by manual recoding instead, it works, like so:
df2 <- df %>%
mutate(group=
if_else(Subj_ID ==1,
"Treatment",
if_else(Subj_ID == 2,
"Control","Withdrawn")))
df3 <- df2 %>%
mutate(group2=
if_else(Subj_ID ==2 |
Subj_ID ==3,
"TRUE", "FALSE"))
complete.df <- df3 %>% filter(complete.cases(.))
Thoughts?
Upvotes: 2
Views: 667
Reputation: 887058
The grouping attribute added by group_by is what causes the issue. It can be solved by ungrouping and then applying the filter. The OP's last code block (manual recoding) does not create a grouping attribute, which is why it works.
library(dplyr)
df3 %>%
ungroup() %>%
filter(complete.cases(.))
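A quick way to confirm that grouping is the culprit is to inspect the grouping metadata before filtering. The sketch below rebuilds the OP's example under the same names and assumes a dplyr version that provides is_grouped_df() and group_vars(). The NA in Location is hypothetical, added only so the filter has something to remove.

```r
library(dplyr)

# OP's example data; the NA in Location is an illustrative assumption,
# not part of the original post
df <- data.frame(
  Subj_ID  = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  Location = c(1, 2, 3, 1, 4, 2, 1, NA, 5)
)

# Same grouped mutate as in the question
df3 <- df %>%
  group_by(Subj_ID) %>%
  mutate(group2 = case_when(any(Location == 4) | any(Location == 5) ~ "YES",
                            TRUE ~ "NO"))

is_grouped_df(df3)  # TRUE: the grouping survives the mutate
group_vars(df3)     # "Subj_ID"

# Drop the grouping first, then keep only rows without NA
complete.df <- df3 %>%
  ungroup() %>%
  filter(complete.cases(.))

nrow(complete.df)
```

If the data should stay grouped for later steps, group_by(Subj_ID) can be reapplied after the filter.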
Or, instead of complete.cases in filter, we can use !is.na with filter_all, which works without removing the grouping attribute
df3 %>%
# all_vars() keeps only rows where every column is non-NA, matching complete.cases()
filter_all(all_vars(!is.na(.)))
The OP mentioned that the last code block works, but that is because it has no grouping attribute. If we create one, it fails too:
df3 %>%
group_by(group) %>%
filter(complete.cases(.))
Error: Result must have length 3, not 9
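One plausible reading of the length mismatch is that a grouped filter() evaluates its condition once per group (3 rows here), while the `.` supplied by the pipe still refers to the whole 9-row data frame. The exact wording of the error may differ between dplyr versions. As a sketch of an alternative that avoids `.` entirely, tidyr::drop_na() removes rows containing NA and can be called on a grouped data frame. The data is the OP's example with one hypothetical NA added for illustration.

```r
library(dplyr)
library(tidyr)

# OP's example data; the NA in Location is an illustrative assumption
df <- data.frame(
  Subj_ID  = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  Location = c(1, 2, 3, 1, 4, 2, 1, NA, 5)
)

grouped <- df %>% group_by(Subj_ID)

# drop_na() checks every column by default and does not rely on `.`,
# so no ungroup() is needed before calling it
complete.df <- grouped %>% drop_na()

# The complement: rows with at least one NA, for separate review
incomplete.df <- grouped %>%
  ungroup() %>%
  filter(!complete.cases(.))

nrow(complete.df)
nrow(incomplete.df)
```

Keeping both subsets around covers the OP's stated need to review complete and incomplete cases separately.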
Upvotes: 3