dplyr/purrr iterate over columns as well as rows

Question

I'm trying to drop (set to NA) values in one column based on the values in another column, and to do this across a large set of column pairs. The idea is then to pass the data to a plotting function to generate different plots for different cuts of the data.

Here's a reproducible example:

d <- data.frame("A_agree" = sample(1:7, 20, replace=TRUE),
                "B_agree" = sample(1:7, 20, replace=TRUE),
                "C_agree" = sample(1:7, 20, replace=TRUE),
                "A_change" = sample(1:5, 20, replace=TRUE),
                "B_change" = sample(1:5, 20, replace=TRUE),
                "C_change" = sample(1:5, 20, replace=TRUE))

I've already found the following solution using base R, but it is slow because it loops over every cell. I'm also trying to learn more dplyr, so I was wondering how to achieve this with dplyr:

d.positive <- d
for (n in (c("A","B","C"))) {
  for (i in 1:nrow(d.positive)) {
    d.positive[i, paste0(n, "_agree")] <- ifelse(d.positive[i, paste0(n, "_change")] > 3,
                                                 d.positive[i, paste0(n, "_agree")],
                                                 NA)
  }
}
d.neutral <- d
for (n in (c("A","B","C"))) {
  for (i in 1:nrow(d.neutral)) {
    d.neutral[i, paste0(n, "_agree")] <- ifelse(d.neutral[i, paste0(n, "_change")] == 3,
                                                 d.neutral[i, paste0(n, "_agree")],
                                                 NA)
  }
}
d.negative <- d
for (n in (c("A","B","C"))) {
  for (i in 1:nrow(d.negative)) {
    d.negative[i, paste0(n, "_agree")] <- ifelse(d.negative[i, paste0(n, "_change")] < 3,
                                                 d.negative[i, paste0(n, "_agree")],
                                                 NA)
  }
}

I thought I would use gather() and then check, for each row, whether the corresponding column (hence the !!dimension) is greater than a certain value (3 in this case), but this doesn't work:

d %>%
  gather(dimension,
         value,
         paste0(c("A","B","C"), "_agree")
         ) %>%
  case_when(!!dimension > 3 ~ value=NA)

Alternatively, I thought I'd use map2_dfr() from purrr, but I don't think it iterates over cells; it seems to pass the entire column, so this doesn't work either:

map2_dfr(.x = d %>%
                 select( paste0(c("A","B","C"), "_agree") ),
         .y = d %>%
                 select( paste0(c("A","B","C"), "_change") ),
         ~ if_else(.y > 3, x, NA)} )

Any pointers would be really helpful, so I can keep learning about the wonderful world of dplyr!

Humpelstielzchen · Accepted Answer

I get that you want to learn about purrr, but base R is just easier here:

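d.positive <- d

# TRUE wherever *_change is 3 or lower, i.e. the complement of the "> 3" keep condition
check <- d.positive[4:6] <= 3
# columns 1:3 (*_agree) line up with columns 4:6 (*_change), so the logical matrix indexes them directly
d.positive[, 1:3][check] <- NA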

> d.positive
   A_agree B_agree C_agree A_change B_change C_change
1        1      NA      NA        4        3        2
2        2       2      NA        4        5        2
3        4      NA      NA        4        3        1
4        1      NA      NA        4        1        2
5       NA       1      NA        2        4        1
6       NA       7      NA        3        5        1
7       NA       6      NA        1        5        1
8       NA       6       4        2        5        5
9        4      NA      NA        4        1        2
10       1      NA      NA        5        1        2
11      NA      NA      NA        3        1        2
12      NA      NA      NA        1        3        3
13      NA      NA      NA        1        1        1
14      NA      NA      NA        3        2        3
15       1      NA      NA        5        3        3
16       2      NA      NA        4        3        2
17      NA      NA       6        1        1        4
18      NA      NA      NA        1        1        2
19      NA      NA      NA        2        3        1
20      NA      NA      NA        1        3        1
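
For the purrr route the question asked about, here is a minimal sketch. It relies on the fact that map2() walks over the two sets of columns in parallel and hands each pair over as whole vectors. That is fine here, because ifelse() is vectorised and handles every row of the pair at once. The helper mask_agree() and the set.seed() call are assumptions added for illustration; neither appears in the original post.

```r
library(purrr)

set.seed(42)  # assumption: seed added only so the sketch is reproducible
d <- data.frame(A_agree  = sample(1:7, 20, replace = TRUE),
                B_agree  = sample(1:7, 20, replace = TRUE),
                C_agree  = sample(1:7, 20, replace = TRUE),
                A_change = sample(1:5, 20, replace = TRUE),
                B_change = sample(1:5, 20, replace = TRUE),
                C_change = sample(1:5, 20, replace = TRUE))

items       <- c("A", "B", "C")
agree_cols  <- paste0(items, "_agree")
change_cols <- paste0(items, "_change")

# Hypothetical helper: keep an *_agree value only where keep() is TRUE
# for the matching *_change value; everything else becomes NA.
mask_agree <- function(data, keep) {
  out <- data
  # map2() pairs A_agree with A_change, B_agree with B_change, and so on;
  # each call receives two whole columns.
  out[agree_cols] <- map2(data[agree_cols], data[change_cols],
                          function(agree, change) ifelse(keep(change), agree, NA))
  out
}

d.positive <- mask_agree(d, function(change) change > 3)
d.neutral  <- mask_agree(d, function(change) change == 3)
d.negative <- mask_agree(d, function(change) change < 3)
```

Passing the condition as a function means the three cuts from the question share one code path, and the *_change columns are left untouched.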
