How to transform (standardize) data within categories in a single data frame in R?

Question

I need to scale values only within certain categories. Essentially I have a data.frame with individuals, years, and several predictors.

Individual	Year	Uplift
A	2013	0.76280999
A	2013	1.01930776
A	2015	0.00000000
B	2011	0.78427964
B	2013	0.00000000
B	2013	1.37627043

I need to scale my predictors within each individual and year; in other words, standardize Individual A in 2013, Individual A in 2015, and so on, for 33 individuals and about 86 thousand rows. I would rather not split the data into a separate data frame for each individual-year combination, so I tried a dplyr solution:

library("dplyr")

data %>%
  group_by(Individual, Year) %>%
  mutate(data, std_uplift= scale(uplift) %>%
  ungroup()))

Naturally, this throws an error:

Error: Problem with mutate() input ..1.
x Input ..1 can't be recycled to size 1100.
i Input ..1 is data.
i Input ..1 must be size 1100 or 1, not 83670.
i The error occurred in group 1: Individual = "A", year = "2013".

I don't understand how to fix the error. It seems to be trying to shove the data from all individuals into a single group, and I am guessing there is a better way to scale data within categories. How can I make this work?

Thanks!

TarJae · Accepted Answer

library("dplyr")
df <- tribble(
  ~Individual, ~Year, ~Uplift,
  "A", 2013, 0.76280999,
  "A", 2013, 1.01930776,
  "A", 2015, 0.00000000,
  "B", 2011, 0.78427964,
  "B", 2013, 0.00000000,
  "B", 2013, 1.37627043)

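# group_by() makes mutate() work within each Individual/Year combination.
# The pipe already supplies the data, so it must not be passed to mutate() again;
# doing so is what triggered the recycling error in the question.
df %>%
  group_by(Individual, Year) %>%
  # scale() returns a one-column matrix; as.numeric() turns it into a plain vector
  mutate(std_Uplift = as.numeric(scale(Uplift))) %>%
  ungroup()

# A tibble: 6 x 4
  Individual  Year Uplift std_Uplift
  <chr>      <dbl>  <dbl>      <dbl>
1 A           2013  0.763     -0.707
2 A           2013  1.02       0.707
3 A           2015  0        NaN
4 B           2011  0.784    NaN
5 B           2013  0         -0.707
6 B           2013  1.38       0.707

# Groups with a single row (A/2015 and B/2011) have no spread, so scale() gives NaN for them.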
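
The question mentions several predictors. Below is a minimal sketch of standardizing more than one column per group in a single step, assuming dplyr 1.0.0 or later so that across() is available. The Slope column and its values are invented here purely for illustration; only Individual, Year and Uplift come from the question.

```r
library(dplyr)

# Example data: Uplift is taken from the question,
# Slope is a made-up second predictor (hypothetical).
df <- tribble(
  ~Individual, ~Year, ~Uplift,    ~Slope,
  "A",         2013,  0.76280999, 2.1,
  "A",         2013,  1.01930776, 3.4,
  "A",         2015,  0.00000000, 1.0,
  "B",         2011,  0.78427964, 0.5,
  "B",         2013,  0.00000000, 4.2,
  "B",         2013,  1.37627043, 2.2
)

result <- df %>%
  group_by(Individual, Year) %>%
  # Standardize each listed column within its group and
  # store the result in a new column prefixed with "std_".
  mutate(across(c(Uplift, Slope),
                ~ as.numeric(scale(.x)),
                .names = "std_{.col}")) %>%
  ungroup()

result
```

Each standardized column is written next to its source column under a std_ prefix, so the raw values stay available. Groups that contain a single row have no spread and come out as NaN; with groups of around a thousand rows, as the error message in the question suggests, this should rarely matter.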
