Reputation: 578
I am learning to use the dplyr package.
library(dplyr)
A toy dataset:
d <- expand.grid("id"=1:3,"x1"=10:12,"x2"=(20:22))
Later I may need to loop through the columns; my real data has 30K rows and 70 columns.
i <- 2
Here I am hoping to use a generic variable name:
my.variable <- names(d[i])
my.variable
A function to normalize each group to the range 0-1
norm <- function(x) (x - min(x,na.rm = TRUE))/(max(x,na.rm = TRUE)-min(x,na.rm = TRUE))
df.out <- d %>% group_by(id) %>% mutate(x.norm = norm(get(my.variable, envir = as.environment(d))))
This throws an error:
Error: incompatible size (%d), expecting %d (the group size) or 1
Any help appreciated as to the reason for the error. Also, is this a viable way of doing this normalizing task?
Upvotes: 0
Views: 603
Reputation: 66874
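The problem comes from the use of get, which I'm sure is a breach of the @hadley license agreement ;) Called with envir = as.environment(d), get returns the whole column (all 27 rows of the toy data) rather than just the rows of the current group, so its length does not match the group size.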
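To make the size mismatch concrete, here is a small base R sketch using the toy data from the question. The name full_column is only introduced for illustration.

```r
# Toy data from the question: 3 x 3 x 3 = 27 rows, 9 rows per id
d <- expand.grid(id = 1:3, x1 = 10:12, x2 = 20:22)
my.variable <- names(d)[2]  # "x1"

# get() looks the name up in the whole data frame and knows nothing
# about the grouping, so it returns the entire column
full_column <- get(my.variable, envir = as.environment(d))
length(full_column)   # 27

# Number of rows available in each group
table(d$id)           # 9 for each id
```

Inside a grouped mutate, each group expects either 9 values or 1, which is what the error message is complaining about.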
To evaluate character arguments, you can use mutate_each_q. However, when using a single function, it will overwrite the variable, so you must use two functions and drop the second variable afterwards:
d %>% group_by(id) %>% mutate_each_q(funs(x.norm=norm, identity),my.variable) %>%
select(-identity)
Source: local data frame [6 x 4]
Groups: id
  id x1 x2 x.norm
1  1 10 20    0.0
2  2 10 20    0.0
3  3 10 20    0.0
4  1 11 20    0.5
5  2 11 20    0.5
6  3 11 20    0.5
...
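For readers on a newer dplyr, a possible alternative is sketched below. It assumes dplyr 0.7 or later, where the .data pronoun is available, and it is not the method used in this answer. The norm function is the asker's, rewritten with range() for brevity.

```r
library(dplyr)

# Toy data and variable name from the question
d <- expand.grid(id = 1:3, x1 = 10:12, x2 = 20:22)
my.variable <- "x1"

# Rescale a vector to the range 0-1, ignoring NAs
norm <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# .data[[name]] looks the column up by its character name
# within the current group, so the sizes always match
df.out <- d %>%
  group_by(id) %>%
  mutate(x.norm = norm(.data[[my.variable]])) %>%
  ungroup()
```

This adds x.norm next to the original column, so no helper column needs to be dropped afterwards.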
Upvotes: 2
Reputation: 10215
I don't know whether you really want the extra column as in @James' answer. Here is how I understand your question:
d %>% group_by(id) %>% mutate_each(funs(norm(.)))
Groups: id
  id  x1  x2
1  1 0.0 0.0
2  2 0.0 0.0
3  3 0.0 0.0
...
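In more recent dplyr releases, mutate_each() and funs() are deprecated. A rough equivalent, assuming dplyr 1.0.0 or later, might use across(); treat this as a sketch rather than a drop-in replacement. The name d.norm is only for illustration.

```r
library(dplyr)

# Toy data from the question
d <- expand.grid(id = 1:3, x1 = 10:12, x2 = 20:22)

# Rescale a vector to the range 0-1, ignoring NAs
norm <- function(x) {
  (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
}

# across(everything(), ...) applies norm to every non-grouping column,
# one group at a time; the grouping column id is left untouched
d.norm <- d %>%
  group_by(id) %>%
  mutate(across(everything(), norm)) %>%
  ungroup()
```

Note that a column that is constant within a group would give 0/0 = NaN with this norm function, which may matter on the real 70-column data.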
Upvotes: 2