Reputation: 399
I have a list:
pr <- list(x = c("a", "b", "c"),
y = c("a", "b"),
z = c("a"))
and a data frame df
:
> dput(df)
structure(list(m = c("x", "y", "x", "y", "x", "x", "z", "y",
"z"), order = c(2, 3, 0, 0, 0, 0, 2, 0, 0), a = c(0, 0, -1, -1,
0, 0, 0, -1, -1), b = c(0, 0, 0, 0, -1, 0, 0, 0, 0), c = c(0,
0, 0, 0, 0, -1, 0, 0, 0)), .Names = c("m", "order", "a", "b",
"c"), row.names = c(NA, -9L), class = c("tbl_df", "tbl", "data.frame"
))
which looks as following
> dff
# A tibble: 9 x 5
m order a b c
<chr> <dbl> <dbl> <dbl> <dbl>
1 x 2.00 0 0 0
2 y 3.00 0 0 0
3 x 0 -1.00 0 0
4 y 0 -1.00 0 0
5 x 0 0 -1.00 0
6 x 0 0 0 -1.00
7 z 2.00 0 0 0
8 y 0 -1.00 0 0
9 z 0 -1.00 0 0
Now, if the value in order
is larger than zero, check the corresponding value in m
and add the order
-value only to those columns which names correspond to the value of m
in the list pr
.
So, the desired output should look like
m order a b c
<chr> <dbl> <dbl> <dbl> <dbl>
1 x 2.00 2.00 2.00 2.00 (since x = c("a", "b", "c")
2 y 3.00 3.00 3.00 0 (since y = c("a", "b")
3 x 0 -1.00 0 0
4 y 0 -1.00 0 0
5 x 0 0 -1.00 0
6 x 0 0 0 -1.00
7 z 2.00 2.00 0 0 (since z = c("a")
8 y 0 -1.00 0 0
9 z 0 -1.00 0 0
I've tried to attack this using mutate_at
, quosures, !!
but now I'm stuck.
Any help would be very much appreciated. Thank you in advance!
Upvotes: 1
Views: 72
Reputation: 14764
What about:
library(tidyverse)
dynamic_function <- function(df, list_var, m_var, order_var, ...) {
group_var <- quos(...)
order_var <- enquo(order_var)
byvar1 <- enquo(m_var)
byvar2 <- "key"
by <- setNames(quo_name(byvar2), quo_name(byvar1))
list_var <- data.frame(sapply(list_var, '[', seq(max(lengths(list_var))))) %>%
gather() %>% na.omit()
df_gathered <- df %>%
mutate(rown = row_number()) %>%
gather(key, value, !!! group_var) %>%
left_join(list_var, by = by) %>%
filter(key == value.y) %>%
group_by(!! byvar1, !! order_var) %>%
mutate(
value = case_when(
!! order_var > 0 ~ !! order_var,
TRUE ~ value.x
)
) %>% ungroup() %>% distinct(!! byvar1, !! order_var, key, value, rown) %>%
spread(key, value) %>%
group_by(!! byvar1, !! order_var, rown) %>%
replace(., is.na(.), 0) %>%
summarise_at(vars(!!! group_var), funs(sum)) %>%
arrange(rown) %>% select(-rown) %>% ungroup()
return(df_gathered)
}
You can call this function like:
dfs <- dynamic_function(df, list_var = pr, m_var = m, order_var = order, a, b, c)
Where df
is you dataframe name, list_var
is your list name, m_var
is the name of m column, order_var
is the name of order column, and a, b, c
are dynamic columns you want (you can add d, e, f
...).
Output:
# A tibble: 9 x 5
m order a b c
<chr> <dbl> <dbl> <dbl> <dbl>
1 x 2 2 2 2
2 y 3 3 3 0
3 x 0 -1 0 0
4 y 0 -1 0 0
5 x 0 0 -1 0
6 x 0 0 0 -1
7 z 2 2 0 0
8 y 0 -1 0 0
9 z 0 -1 0 0
You will get a warning about attributes which you can ignore.
Upvotes: 0
Reputation: 48231
The problem doesn't seem to be straightforward, so my solution is not particularly elegant:
df %>% mutate(row = row_number()) %>%
gather(key, value, -m, -order, -row) %>%
mutate(value = value + order * (order > 0 & mapply(`%in%`, key, pr[m]))) %>%
spread(key, value) %>% select(-row)
First I define row
as an auxiliary variable for using spread
later. Now that all the values of a
, b
, c
are in a single column, simply mutate
can be used. Then we go back.
Simply using a loop I guess is more concise than most if not all solutions in this case:
for(r in which(df$order > 0))
df[r, pr[[df$m[r]]]] <- df[r, pr[[df$m[r]]]] + df$order[r]
Note that neither of the solutions mentions a
, b
, c
so that a large number of columns is not an issue.
Upvotes: 1