Reputation: 1353
I have the following example data.frame:
df = data.frame(a=c(rep("a",8), rep("b",5), rep("c",7), rep("d",10)),
                b=rnorm(30, 6, 2),
                c=rnorm(30, 12, 3.5),
                d=rnorm(30, 8, 3)
)
For each column, I would like to calculate z scores per subgroup defined in column a. This post was helpful for me and I can now do this using the following:
df$b.zscore <- ave(df$b, df$a, FUN = scale)
df$c.zscore <- ave(df$c, df$a, FUN = scale)
df$d.zscore <- ave(df$d, df$a, FUN = scale)
But my real data has many more columns. Is there a more elegant way to accomplish this for columns b-d? Maybe using a for loop? How could I do that, please? I hope someone can help. Thank you.
Upvotes: 0
Views: 135
Reputation: 388807
You can use lapply over the columns:
cols <- c('b', 'c', 'd')
new_cols <- paste0(cols, '_zscore')
df[new_cols] <- lapply(df[cols], function(x) ave(x, df$a, FUN = scale))
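Since the question asks about a for loop, the same idea can also be written as an explicit loop. This is only a minimal sketch: the `set.seed()` call and the `_zscore` suffix are assumptions made for illustration, and `as.numeric()` is used to drop the one-column matrix that `scale()` returns.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(a = c(rep("a", 8), rep("b", 5), rep("c", 7), rep("d", 10)),
                 b = rnorm(30, 6, 2),
                 c = rnorm(30, 12, 3.5),
                 d = rnorm(30, 8, 3))

cols <- c("b", "c", "d")
for (col in cols) {
  # ave() applies FUN within each level of df$a and returns the values
  # in the original row order
  df[[paste0(col, "_zscore")]] <- ave(df[[col]], df$a,
                                      FUN = function(x) as.numeric(scale(x)))
}

str(df)
```

The loop and the lapply version do the same work; lapply just avoids the explicit bookkeeping of the new column names.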
However, operations that apply the same function to multiple columns are often more convenient with dplyr:
library(dplyr)
df %>%
  group_by(a) %>%
  mutate(across(b:d, list(zscore = ~as.numeric(scale(.)))))
  #For dplyr < 1.0.0
  #mutate_at(vars(b:d), list(zscore = ~as.numeric(scale(.))))
and with data.table:
library(data.table)
setDT(df)[, (new_cols) := lapply(.SD, function(x) as.numeric(scale(x))),
          by = a, .SDcols = cols]
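Whichever approach is used, it may be worth checking the result. With its default arguments, `scale()` subtracts the mean and divides by the standard deviation, so a per-group z score should match `(x - mean(x)) / sd(x)` computed within each group. A small base R sketch, where the vectors `x` and `g` and the seed are made up for illustration:

```r
set.seed(42)  # assumption: fixed seed, illustrative data only
x <- rnorm(30, 6, 2)
g <- rep(c("a", "b", "c", "d"), times = c(8, 5, 7, 10))

# Per-group z scores via scale() and via the explicit formula
z_scale  <- ave(x, g, FUN = function(v) as.numeric(scale(v)))
z_manual <- ave(x, g, FUN = function(v) (v - mean(v)) / sd(v))

# The two versions should agree up to floating-point error
all.equal(z_scale, z_manual)

# Within each group the z scores should have mean ~0 and sd ~1
tapply(z_scale, g, mean)
tapply(z_scale, g, sd)
```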
Upvotes: 1