Reputation: 1563
Let's say I have a data frame like this:
df <- structure(list(subjecttaxnoid = c("22661187010", "10346575807",
"22439110996", "63510438612", "85267957976", "40178118040", "51246665873",
"66803849969", "45813719599", "26979059418", "11240408751"),
reportyear = c(2014L, 2014L, 2014L, 2008L, 2008L, 2008L,
2008L, 2013L, 2013L, 2013L, 2013L), b001 = c(0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0), b002 = c(0, 3.43884233571018e-07, 7.24705810574303e-08,
1.41222784374111e-07, 1.62917712565032e-05, 0, 4.53310814208705e-07,
7.63856039195011e-06, 0, 0, 0)), .Names = c("subjecttaxnoid",
"reportyear", "b001", "b002"), row.names = c(1L, 2L, 3L, 200000L,
200001L, 200002L, 200003L, 40000L, 40001L, 40002L, 40003L), class = "data.frame")
and a vector that contains the names of two columns of df:
x <- c("b001", "b002")
I would like to use the elements of x instead of column names in dplyr:
my_list <- list()
for (i in seq_along(x)) {
  my_list[[i]] <- df %>% group_by(reportyear) %>% top_n(2, wt = x[i])
}
This returns an error:
Error in eval(substitute(expr), envir, enclos) :
Unsupported use of matrix or array for column indexing
Could you please help with this issue?
Upvotes: 0
Views: 1885
Reputation: 14346
I don't think there is an easy way around this (e.g. by wrapping x[1] inside as.name) unless you want to change the function top_n. The reason, as @ulfelder suggested in the comments, is that dplyr uses non-standard evaluation, so it expects an unquoted variable name in this case. Other functions have versions that handle quoted arguments (e.g. mutate_, rename_, etc.), but top_n does not.
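To illustrate what non-standard evaluation means here, the following base R sketch uses a hypothetical helper, show_arg (not part of dplyr), that captures its argument as an unevaluated expression. It is only meant to show why a function written this way sees the expression x[1] rather than the string stored in it.

```r
# Hypothetical helper for illustration only: returns the expression it was
# given as text, without evaluating it (similar in spirit to how NSE
# functions capture their arguments).
show_arg <- function(arg) deparse(substitute(arg))

x <- c("b001", "b002")

res_bare  <- show_arg(b001)  # the bare column name is captured as written
res_index <- show_arg(x[1])  # the indexing expression is captured, not "b001"

print(res_bare)
print(res_index)

# as.name() converts the string into a symbol, which is what the custom
# top_n_ function further down relies on.
sym <- as.name(x[1])
print(is.name(sym))
```

The second call shows the problem: the function receives the expression x[1], not the column it refers to.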
The easiest way around it would be to use a temporary assignment, e.g.
df %>%
  group_by(reportyear) %>%
  mutate_(tempvar = x[1]) %>%
  top_n(2, wt = tempvar) %>%
  select(-tempvar)
(Of course, you need to ensure tempvar is not already a variable name in your data frame, or it will overwrite the existing variable.) This is far from ideal, and you may have thought about it already and rejected it.
Another way is to define your own top_n_ function, which is like top_n but expects a string in the wt argument:
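top_n_ <- function (x, n, wt) {
  # Convert the column name (a string) into a symbol
  wt <- as.name(wt)
  stopifnot(is.numeric(n), length(n) == 1)
  if (n > 0) {
    # Positive n: keep the rows with the n largest values of wt
    call <- substitute(filter(x, min_rank(desc(wt)) <= n),
                       list(n = n, wt = wt))
  }
  else {
    # Otherwise: keep the rows with the abs(n) smallest values of wt
    call <- substitute(filter(x, min_rank(wt) <= n),
                       list(n = abs(n), wt = wt))
  }
  # Evaluate the constructed filter() call; x is looked up in this function
  eval(call)
}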
This is basically just taking top_n and changing the handling of the wt argument at the top of the function definition. Then you can do
df %>% group_by(reportyear) %>% top_n_(2, wt = x[1])
identical(
  df %>% group_by(reportyear) %>% top_n_(2, wt = x[1]),
  df %>% group_by(reportyear) %>% top_n(2, wt = b001)
)
# TRUE
identical(
  df %>% group_by(reportyear) %>% top_n_(2, wt = x[2]),
  df %>% group_by(reportyear) %>% top_n(2, wt = b002)
)
# TRUE
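For readers on a newer dplyr, here is a minimal sketch of the same idea without a custom function. It assumes dplyr 1.0.0 or later, where slice_max() and the .data pronoun are available. The data frame df_small is made up for illustration and is not the data from the question.

```r
library(dplyr)

# Made-up example data (an assumption for this sketch, not the original df)
df_small <- data.frame(
  reportyear = c(2008L, 2008L, 2008L, 2013L, 2013L, 2013L),
  b001 = c(1, 2, 3, 4, 5, 6),
  b002 = c(6, 5, 4, 3, 2, 1)
)
x <- c("b001", "b002")

# For each column name in x, keep the two largest values per reportyear.
# .data[[col]] looks the column up by its name stored as a string.
my_list <- lapply(x, function(col) {
  df_small %>%
    group_by(reportyear) %>%
    slice_max(order_by = .data[[col]], n = 2) %>%
    ungroup()
})
names(my_list) <- x
```

Each element of my_list then holds the per-year top rows for one of the columns named in x, which is what the loop in the question was trying to build.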
Upvotes: 1