How to replace data in current columns using mutate?

Question

I want to group my dataframe by year, standardize certain columns (in this case BioTest, MathExam, and WritingScore), and replace the old data with the new data. Below is an example of my dataframe:

DF:

Var1   Var2   Year  BioTest   MathExam   WritingScore   Var3  Var 4
 X      X     2016   165        140         10           X     X
 X      X     2017   172        128         11           X     X
 X      X     2018   169        115          8           X     X
 X      X     2016   166        139         10           X     X
 X      X     2017   165        140         12           X     X

I have tried variations of the following code:

DF<- DF %>% group_by(Year)%>% mutate(across(BioTest:WritingScore),scale)

DF<- DF %>% group_by(Year)%>% mutate(across(select(BioTest:WritingScore)),scale)

What I get in return is the same DF without any changes. What I want is:

 DF:

 Var1   Var2   Year  BioTest   MathExam   WritingScore   Var3  Var 4
 X      X     2016   NewData     NewData      NewData      X     X
 X      X     2017   NewData     NewData      NewData      X     X
 X      X     2018   NewData     NewData      NewData      X     X
 X      X     2016   NewData     NewData      NewData      X     X
 X      X     2017   NewData     NewData      NewData      X     X

Any help is much appreciated.

akrun · Accepted Answer

The issue could be that dplyr::mutate was masked by plyr::mutate, along with the fact that in your code across() is closed before the function is passed to it. The masking can be reproduced with:

iris %>%
    group_by(Species) %>%
    plyr::mutate(across(where(is.numeric), scale))
# A tibble: 150 x 5
# Groups:   Species [3]
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
# 1          5.1         3.5          1.4         0.2 setosa 
# 2          4.9         3            1.4         0.2 setosa 
# 3          4.7         3.2          1.3         0.2 setosa 
# 4          4.6         3.1          1.5         0.2 setosa 
# 5          5           3.6          1.4         0.2 setosa 
# 6          5.4         3.9          1.7         0.4 setosa 
# 7          4.6         3.4          1.4         0.3 setosa 
# 8          5           3.4          1.5         0.2 setosa 
# 9          4.4         2.9          1.4         0.2 setosa 
#10          4.9         3.1          1.5         0.1 setosa 
# … with 140 more rows

This output is the same as the initial 'iris' dataset; nothing was scaled.

Now, check with the correct dplyr::mutate:

iris %>% 
   group_by(Species) %>%
   dplyr::mutate(across(where(is.numeric), scale))
# A tibble: 150 x 5
# Groups:   Species [3]
#   Sepal.Length[,1] Sepal.Width[,1] Petal.Length[,1] Petal.Width[,1] Species
#              <dbl>           <dbl>            <dbl>           <dbl> <fct>  
# 1           0.267           0.190            -0.357          -0.436 setosa 
# 2          -0.301          -1.13             -0.357          -0.436 setosa 
# 3          -0.868          -0.601            -0.933          -0.436 setosa 
# 4          -1.15           -0.865             0.219          -0.436 setosa 
# 5          -0.0170          0.454            -0.357          -0.436 setosa 
# 6           1.12            1.25              1.37            1.46  setosa 
# 7          -1.15           -0.0739           -0.357           0.512 setosa 
# 8          -0.0170         -0.0739            0.219          -0.436 setosa 
# 9          -1.72           -1.39             -0.357          -0.436 setosa 
#10          -0.301          -0.865             0.219          -1.39  setosa 
# … with 140 more rows

So, in the OP's code, we just need to use dplyr::mutate explicitly, or restart a fresh R session with only dplyr loaded:

DF %>% 
   group_by(Year)%>% 
   dplyr::mutate(across(BioTest:WritingScore, scale))
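
To confirm whether masking is actually the cause in a given session, one option is to ask R which package the bare name mutate currently resolves to. This is a minimal sketch using base R helpers only; it assumes dplyr is installed and attached, and the result will differ if plyr was attached afterwards.

```r
library(dplyr)

# Name of the namespace that the bare `mutate` resolves to in this session.
# If this reports "plyr" rather than "dplyr", dplyr::mutate is being masked.
mutate_source <- environmentName(environment(mutate))
mutate_source

# List objects on the search path that share a name across attached packages.
conflicts(detail = TRUE)
```

If plyr is needed in the same session, loading it before dplyr, or always writing dplyr::mutate, avoids the problem.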

scale returns a one-column matrix with some attributes (which is why the column headers above show [,1]). If we only need the numeric vector part, we can use either as.vector or as.numeric:

DF %>% 
   group_by(Year)%>% 
   dplyr::mutate(across(BioTest:WritingScore, ~ as.numeric(scale(.))))

NOTE: select is not needed within across; across accepts the column selection directly.
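
For reference, here is a self-contained sketch of the whole approach on a small data frame built from the score columns in the question. The Var columns are left out for brevity, and the object name scaled is only illustrative. Note that a year with a single row (2018 here), or a column that is constant within a year, has no spread to standardize by, so those cells should not be expected to hold usable z-scores.

```r
library(dplyr)

# Toy data copied from the example rows in the question (Var columns omitted).
DF <- data.frame(
  Year         = c(2016, 2017, 2018, 2016, 2017),
  BioTest      = c(165, 172, 169, 166, 165),
  MathExam     = c(140, 128, 115, 139, 140),
  WritingScore = c(10, 11, 8, 10, 12)
)

scaled <- DF %>%
  group_by(Year) %>%
  # Standardize each score column within its year; as.numeric() drops the
  # matrix structure and attributes that scale() returns.
  dplyr::mutate(across(BioTest:WritingScore, ~ as.numeric(scale(.x)))) %>%
  ungroup()

scaled
```

The standardized values overwrite the original columns in place because across() keeps the column names when a single function is supplied.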
