Reputation: 911
I'm trying to drop (set to NA) values in one column based on the values in another column, and to do this across a large set of column pairs. The idea is to then pass the data to a plotting function, to generate different plots for different cuts of the data.
Here's a reproducible example:
d <- data.frame("A_agree" = sample(1:7, 20, replace = TRUE),
                "B_agree" = sample(1:7, 20, replace = TRUE),
                "C_agree" = sample(1:7, 20, replace = TRUE),
                "A_change" = sample(1:5, 20, replace = TRUE),
                "B_change" = sample(1:5, 20, replace = TRUE),
                "C_change" = sample(1:5, 20, replace = TRUE))
I've already found the following solution using base R, but it's of course slow, and I'm trying to learn more and more dplyr, so I was wondering how to achieve this in dplyr.
d.positive <- d
for (n in (c("A","B","C"))) {
  for (i in 1:nrow(d.positive)) {
    d.positive[i, paste0(n, "_agree")] <- ifelse(d.positive[i, paste0(n, "_change")] > 3,
                                                 d.positive[i, paste0(n, "_agree")],
                                                 NA)
  }
}
d.neutral <- d
for (n in (c("A","B","C"))) {
  for (i in 1:nrow(d.neutral)) {
    d.neutral[i, paste0(n, "_agree")] <- ifelse(d.neutral[i, paste0(n, "_change")] == 3,
                                                d.neutral[i, paste0(n, "_agree")],
                                                NA)
  }
}
d.negative <- d
for (n in (c("A","B","C"))) {
  for (i in 1:nrow(d.negative)) {
    d.negative[i, paste0(n, "_agree")] <- ifelse(d.negative[i, paste0(n, "_change")] < 3,
                                                 d.negative[i, paste0(n, "_agree")],
                                                 NA)
  }
}
I thought I would use gather(), and then check for each row whether the corresponding column (hence the !!dimension) is bigger than a certain value (3 in this case), but it doesn't seem to work?
d %>%
gather(dimension,
value,
paste0(c("A","B","C"), "_agree")
) %>%
case_when(!!dimension > 3 ~ value=NA)
Alternatively, I thought I'd use map2_dfr from purrr, but I don't think it iterates over cells, it just takes the entire column, hence this doesn't work:
map2_dfr(.x = d %>%
select( paste0(c("A","B","C"), "_agree") ),
.y = d %>%
select( paste0(c("A","B","C"), "_change") ),
~ if_else(.y > 3, x, NA)} )
Any pointers would be really helpful, to keep learning about the wonderful world of dplyr!
Upvotes: 1
Views: 123
Reputation: 6441
I get that you want to learn about purrr, but base R is just easier here:
d.positive <- d
check <- d.positive[4:6] <= 3 # TRUE where change is not > 3, i.e. the cells to set to NA
d.positive[,1:3][check] <- NA
> d.positive
   A_agree B_agree C_agree A_change B_change C_change
1        1      NA      NA        4        3        2
2        2       2      NA        4        5        2
3        4      NA      NA        4        3        1
4        1      NA      NA        4        1        2
5       NA       1      NA        2        4        1
6       NA       7      NA        3        5        1
7       NA       6      NA        1        5        1
8       NA       6       4        2        5        5
9        4      NA      NA        4        1        2
10       1      NA      NA        5        1        2
11      NA      NA      NA        3        1        2
12      NA      NA      NA        1        3        3
13      NA      NA      NA        1        1        1
14      NA      NA      NA        3        2        3
15       1      NA      NA        5        3        3
16       2      NA      NA        4        3        2
17      NA      NA       6        1        1        4
18      NA      NA      NA        1        1        2
19      NA      NA      NA        2        3        1
20      NA      NA      NA        1        3        1
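To get all three cuts without repeating the block, the same masking idea can probably be wrapped in a small function. This is only a sketch: mask_agree and its keep argument are names made up here, and it assumes the columns follow the <item>_agree / <item>_change naming from the question, so they are looked up by name rather than by position.

```r
set.seed(1)  # only so the sketch is reproducible
d <- data.frame(A_agree  = sample(1:7, 20, replace = TRUE),
                B_agree  = sample(1:7, 20, replace = TRUE),
                C_agree  = sample(1:7, 20, replace = TRUE),
                A_change = sample(1:5, 20, replace = TRUE),
                B_change = sample(1:5, 20, replace = TRUE),
                C_change = sample(1:5, 20, replace = TRUE))

# Hypothetical helper (not from any package): keep each *_agree value only
# where the matching *_change value satisfies `keep`; set the rest to NA.
mask_agree <- function(data, keep, items = c("A", "B", "C")) {
  agree_cols  <- paste0(items, "_agree")
  change_cols <- paste0(items, "_change")
  # Logical matrix with one cell per value: TRUE where the value is dropped
  drop <- !keep(as.matrix(data[change_cols]))
  data[agree_cols][drop] <- NA
  data
}

d.positive <- mask_agree(d, function(x) x > 3)
d.neutral  <- mask_agree(d, function(x) x == 3)
d.negative <- mask_agree(d, function(x) x < 3)
```

Matching by name should keep this working if the column order changes, which the positional 1:3 and 4:6 indices would not.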
Upvotes: 2
Reputation: 701
I would suggest using the tidyr package in combination with dplyr. It provides the newer functions pivot_longer and pivot_wider, which replace the older gather and spread.
Using both, the solution could look as follows:
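library(dplyr)
library(tidyr)

d.neutral1 =
  d %>%
  mutate(row = row_number() ) %>%                                            # row id, so rows can be rebuilt later
  pivot_longer(-row, names_sep = "_", names_to = c("name","type") ) %>%      # "A_agree" -> name "A", type "agree"
  pivot_wider(names_from = type, values_from = value) %>%                    # one agree and one change column
  mutate(result = if_else(change == 3, agree, NA_integer_))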
And if you want a shape similar to the original:
d.neutral1 %>%
  select(-agree, -change) %>%
  pivot_wider(names_from = name, values_from = result)
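Regarding the purrr attempt in the question: map2 does pass whole columns rather than single cells, but since if_else is vectorised that should be all that is needed here. A minimal sketch, assuming the column naming from the question and integer agree columns (as sample() produces); agree_cols and change_cols are just helper names introduced for this example:

```r
library(dplyr)
library(purrr)

set.seed(42)  # only so the sketch is reproducible
d <- data.frame(A_agree  = sample(1:7, 20, replace = TRUE),
                B_agree  = sample(1:7, 20, replace = TRUE),
                C_agree  = sample(1:7, 20, replace = TRUE),
                A_change = sample(1:5, 20, replace = TRUE),
                B_change = sample(1:5, 20, replace = TRUE),
                C_change = sample(1:5, 20, replace = TRUE))

items       <- c("A", "B", "C")
agree_cols  <- paste0(items, "_agree")
change_cols <- paste0(items, "_change")

d.positive <- d
# map2() walks the two column lists in parallel; .x is an agree column and
# .y the matching change column, and if_else() works on the whole vector.
d.positive[agree_cols] <- map2(d[agree_cols], d[change_cols],
                               ~ if_else(.y > 3, .x, NA_integer_))
```

Swapping the condition for .y == 3 or .y < 3 would give the neutral and negative cuts, and the change columns are left untouched.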
Upvotes: 1