apply a function to a dynamically changed number of columns for each row

Question

I have a list:

pr <- list(x = c("a", "b", "c"),
           y = c("a", "b"),
           z = c("a"))

and a data frame df:

> dput(df)
structure(list(m = c("x", "y", "x", "y", "x", "x", "z", "y", 
"z"), order = c(2, 3, 0, 0, 0, 0, 2, 0, 0), a = c(0, 0, -1, -1, 
0, 0, 0, -1, -1), b = c(0, 0, 0, 0, -1, 0, 0, 0, 0), c = c(0, 
0, 0, 0, 0, -1, 0, 0, 0)), .Names = c("m", "order", "a", "b", 
"c"), row.names = c(NA, -9L), class = c("tbl_df", "tbl", "data.frame"
))

which looks as follows:

> df
# A tibble: 9 x 5
  m     order     a     b     c
  <chr> <dbl> <dbl> <dbl> <dbl>
1 x      2.00  0     0     0   
2 y      3.00  0     0     0   
3 x      0    -1.00  0     0   
4 y      0    -1.00  0     0   
5 x      0     0    -1.00  0   
6 x      0     0     0    -1.00
7 z      2.00  0     0     0   
8 y      0    -1.00  0     0   
9 z      0    -1.00  0     0

Now, for each row where the value in order is greater than zero, I want to look up that row's value of m in the list pr and add the order value only to the columns named in that list element.

So, the desired output should look like

  m     order     a     b     c
  <chr> <dbl> <dbl> <dbl> <dbl>
1 x      2.00  2.00  2.00  2.00   (since x = c("a", "b", "c"))
2 y      3.00  3.00  3.00  0      (since y = c("a", "b"))
3 x      0    -1.00  0     0   
4 y      0    -1.00  0     0   
5 x      0     0    -1.00  0   
6 x      0     0     0    -1.00
7 z      2.00  2.00  0     0      (since z = c("a"))
8 y      0    -1.00  0     0   
9 z      0    -1.00  0     0

I've tried to attack this using mutate_at, quosures, and !!, but now I'm stuck.

Any help would be very much appreciated. Thank you in advance!

Julius Vainora · Accepted Answer

The problem doesn't seem to be straightforward, so my solution is not particularly elegant:

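library(dplyr)
library(tidyr)

df %>% mutate(row = row_number()) %>%       # row id so spread() can rebuild rows
  gather(key, value, -m, -order, -row) %>%  # stack a, b, c into key/value pairs
  mutate(value = value + order * (order > 0 & mapply(`%in%`, key, pr[m]))) %>% 
  spread(key, value) %>% select(-row)       # back to wide format, drop helper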

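First I define row as an auxiliary variable so that spread can later rebuild exactly one row per original observation. After gather, all the values of a, b, c sit in a single value column, so a plain mutate suffices: order is added wherever order > 0 and the column name in key is among the names in pr[[m]]. Then spread takes us back to the wide format and row is dropped.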
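
In tidyr 1.0.0 and later, gather and spread are superseded by pivot_longer and pivot_wider. The same reshape, mutate, reshape idea might be sketched as follows. This is a translation under that assumption rather than the code from the answer above, and the names res, key and value are only illustrative. The data are rebuilt from the question's dput output so that the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Data from the question
pr <- list(x = c("a", "b", "c"), y = c("a", "b"), z = "a")
df <- tibble(
  m     = c("x", "y", "x", "y", "x", "x", "z", "y", "z"),
  order = c(2, 3, 0, 0, 0, 0, 2, 0, 0),
  a     = c(0, 0, -1, -1, 0, 0, 0, -1, -1),
  b     = c(0, 0, 0, 0, -1, 0, 0, 0, 0),
  c     = c(0, 0, 0, 0, 0, -1, 0, 0, 0)
)

res <- df %>%
  mutate(row = row_number()) %>%                       # helper id, one per original row
  pivot_longer(-c(m, order, row),
               names_to = "key", values_to = "value") %>%
  # add order only where order > 0 and the column name is listed in pr[[m]]
  mutate(value = value +
           order * (order > 0 & unname(mapply(`%in%`, key, pr[m])))) %>%
  pivot_wider(names_from = key, values_from = value) %>%
  select(-row)

res
```

As in the gather/spread version, no column name is hard-coded, so the sketch should carry over to more columns as long as m and order are the only non-value columns.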

A plain loop over the affected rows is, I guess, more concise than most if not all other solutions in this case:

for(r in which(df$order > 0))
  df[r, pr[[df$m[r]]]] <- df[r, pr[[df$m[r]]]] + df$order[r]

Note that neither solution mentions a, b, c explicitly, so a large number of columns is not an issue.
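
If the data have many rows but pr refers to only a handful of distinct columns, one could also loop over columns instead of rows, so that each step is vectorized over all rows. A minimal base R sketch, assuming m is a character column and every name in pr is a column of df; cols and hit are illustrative names, and a plain data.frame stands in for the tibble:

```r
# Data from the question, as a plain data.frame
pr <- list(x = c("a", "b", "c"), y = c("a", "b"), z = "a")
df <- data.frame(
  m     = c("x", "y", "x", "y", "x", "x", "z", "y", "z"),
  order = c(2, 3, 0, 0, 0, 0, 2, 0, 0),
  a     = c(0, 0, -1, -1, 0, 0, 0, -1, -1),
  b     = c(0, 0, 0, 0, -1, 0, 0, 0, 0),
  c     = c(0, 0, 0, 0, 0, -1, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Columns are derived from pr, so none of them is named explicitly
cols <- unique(unlist(pr))

for (col in cols) {
  # TRUE for rows where order > 0 and this column is listed in pr[[m]]
  hit <- df$order > 0 &
    vapply(pr[df$m], function(v) col %in% v, logical(1), USE.NAMES = FALSE)
  df[[col]] <- df[[col]] + df$order * hit
}

df
```

Whether this beats the row loop depends on the shape of the data: it runs one iteration per column named in pr rather than one per row with order > 0.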
