Reputation: 1
I can't seem to get step_corr() to function inside a recipe.
Minimal example:
library(recipes)  # also attaches dplyr, which provides mutate() and %>%
df <- data.frame(x1=runif(10)) %>%
  mutate(x2=x1+1) %>%
  mutate(y=x1+rnorm(10))
cor(df)
rec <- recipe(y~x1+x2, data = df) %>%
  step_corr(threshold=0.9) %>%
  prep(df)
bake(rec, new_data=df)
What am I doing wrong or misunderstanding? Thank you.
Upvotes: 0
Views: 40
Reputation: 3185
You forgot to select variables in step_corr(). All steps allow empty selections, in which case the step does nothing.
library(recipes)
df <- data.frame(x1=runif(10)) %>%
  mutate(x2=x1+1) %>%
  mutate(y=x1+rnorm(10))
cor(df)
#>           x1        x2         y
#> x1 1.0000000 1.0000000 0.6882089
#> x2 1.0000000 1.0000000 0.6882089
#> y  0.6882089 0.6882089 1.0000000
rec <- recipe(y~x1+x2, data = df) %>%
  step_corr(all_predictors(), threshold=0.9) %>%
  prep(df)
bake(rec, new_data=df)
#> # A tibble: 10 × 2
#>       x2      y
#>    <dbl>  <dbl>
#>  1  1.06 -0.353
#>  2  1.53 -0.951
#>  3  1.87  2.51
#>  4  1.43 -0.288
#>  5  1.60  0.696
#>  6  1.64  0.296
#>  7  1.31  1.16
#>  8  1.07 -1.37
#>  9  1.49 -0.215
#> 10  1.70  1.16
Created on 2024-08-05 with reprex v2.1.0
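If it helps to see the difference side by side, here is a rough sketch that preps the same recipe with and without a selector and then asks the step which columns it removed. The seed and the object names (rec_empty, rec_sel) are just made up for illustration, and it assumes a reasonably recent recipes version where an empty selection is accepted silently.

```r
library(recipes)  # also attaches dplyr, which provides %>%

set.seed(1)  # arbitrary seed, only so the sketch is repeatable
df <- data.frame(x1 = runif(10))
df$x2 <- df$x1 + 1          # perfectly correlated with x1
df$y  <- df$x1 + rnorm(10)

# No selector: step_corr() has no columns to work on, so nothing is removed
rec_empty <- recipe(y ~ x1 + x2, data = df) %>%
  step_corr(threshold = 0.9) %>%
  prep()

# With a selector: the step sees both predictors and drops one of them
rec_sel <- recipe(y ~ x1 + x2, data = df) %>%
  step_corr(all_predictors(), threshold = 0.9) %>%
  prep()

# new_data = NULL returns the processed training data
names(bake(rec_empty, new_data = NULL))
names(bake(rec_sel, new_data = NULL))

# tidy() on the first step lists the column(s) it removed
tidy(rec_sel, number = 1)
```

Checking tidy() after prep() is a quick way to confirm that a filter step actually selected something, rather than comparing the baked output by eye.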
Upvotes: 1