Reputation: 2306
I have a small trouble with my df. Firstly I'll show you an example, then explain what I'd like to recieve.
My input df:
C1 C2 C3 C4 C5 C6 C7 C8
A I I D X I I I
A I I I X D I I
A I I I X I I I
A I D I X NC I I
B D D I X I I I
B D I NC X I I D
C NC I I X NC D I
C I I I X I I I
C I I I X I I D
D NC NC I X D D D
D I I I X D D I
D D D I X I I NC
D I I I X NC I I
E NC I I X I I D
E I I I X I D D
Desired result:
C1 C2 C3 C4 C5 C6 C7 C8
A I I D X I I I
A I I I X D I I
A I I I X I I I
A I D I X NC I I
I'd like to have only groups (group by column 'C1'
) (with all rows) where there're at least 2 occurances of 'I'
in each row of group (let's take A
) in group columns C2, C3, C4
and C6, C7, C8
.
I decided to use filter()
, all()
and rowSums()
df_filtered <- df %>%
group_by(C1) %>%
filter(all(rowSums(df[,2:4] == 'I' & df[,6:8] == 'I') >= 2))
wht doesn't it work? It returns 0 rows, no idea why...
Upvotes: 1
Views: 49
Reputation: 16940
df %>%
mutate(condition = rowSums(.[2:4] == 'I') >= 2 & rowSums(.[6:8] == 'I') >= 2) %>%
group_by(C1) %>%
filter(all(condition)) %>%
select(-condition)
# A tibble: 4 x 8
# Groups: C1 [1]
C1 C2 C3 C4 C5 C6 C7 C8
<fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr>
1 A I I D X I I I
2 A I I I X D I I
3 A I I I X I I I
4 A I D I X NC I I
When you used
filter(all(rowSums(df[,2:4] == 'I' & df[,6:8] == 'I') >= 2))
the all()
comparison was on all rows of df
, not just the ones from your group. This approach evaluates the condition for each row, then calls all()
only on the group.
Upvotes: 2
Reputation: 310
you could try with unite()
and then filtering by regular expressions. Here your example:
library(tidyverse)
# First loading your data
data <-read.table(text = "C1 C2 C3 C4 C5 C6 C7 C8
A I I D X I I I
A I I I X D I I
A I I I X I I I
A I D I X NC I I
B D D I X I I I
B D I NC X I I D
C NC I I X NC D I
C I I I X I I I
C I I I X I I D
D NC NC I X D D D
D I I I X D D I
D D D I X I I NC
D I I I X NC I I
E NC I I X I I D
E I I I X I D D", header = T)
# Then filtering rows
data %>%
# Creating a helper column
unite(merged, C1:C8, sep = "", remove = F) %>%
# Filtering by regexp
filter(grepl("^A", merged), grepl("II", merged)) %>%
# Deleting helper column
select(-merged)
C1 C2 C3 C4 C5 C6 C7 C8
1 A I I D X I I I
2 A I I I X D I I
3 A I I I X I I I
4 A I D I X NC I I
have fun ;)
Upvotes: 0