piper180

Reputation: 361

How to specify response using recipe package in Tidymodels

I am creating a recipe in which I first create a calculated column called "response", like so:

rec <- recipe( ~., data = training) %>%
  step_mutate(response = as.integer(all(c('A', 'B') %in% Col4) & Col4 == 'A'))

I would like to now specify this new calculated column as the response variable in the recipe() function as shown below. I will be doing a series of operations on it such as this first one with step_naomit. How do I re-specify my response in recipe() to be the calculated column from my previous step (above) using recipes?

recipe <- recipe(response ~ ., data = training) %>%
          step_naomit(response)

Upvotes: 1

Views: 330

Answers (2)

EmilHvitfeldt

Reputation: 3185

This is related to the question "tidymodel error, when calling predict function is asking for target variable".

It is generally not advisable to modify the response inside your recipe. This is because the response variable won't be available to the recipe in certain cases, such as when using {tune}. I would recommend that you perform this transformation before you pass the data to the recipe. Even better if you do it before the validation split.

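library(tidymodels)

set.seed(1234)

# Create the response with dplyr::mutate() on the raw data, before splitting
data_split <- my_data %>%
  mutate(response = as.integer(all(c('A', 'B') %in% Col4) & Col4 == 'A')) %>%
  initial_split()

training <- training(data_split)
testing <- testing(data_split)

# The response column now exists in the data, so it can go in the formula
rec <- recipe(response ~ ., data = training)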

Upvotes: 3

MrFlick

Reputation: 206232

You can set the role for new columns in the step_mutate() function by explicitly setting the role= parameter.

rec <- recipe(~ ., data = iris) %>%
  step_mutate(SepalSquared = Sepal.Length^2, role = "outcome")

Then check that it worked with summary(prep(rec)):

  variable     type    role      source  
  <chr>        <chr>   <chr>     <chr>   
1 Sepal.Length numeric predictor original
2 Sepal.Width  numeric predictor original
3 Petal.Length numeric predictor original
4 Petal.Width  numeric predictor original
5 Species      nominal predictor original
6 SepalSquared numeric outcome   derived 
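
As a rough sketch of how the derived outcome could then feed into later steps such as step_naomit(), the example below chains a role-based selector after the step_mutate() call. It assumes that all_outcomes() resolves against roles assigned by earlier steps when the recipe is prepped; the variable names prepped and baked are only illustrative. The caveat from the other answer still applies: an outcome created inside the recipe may not be available when the recipe is applied to new data.

```r
library(recipes)

# Build the recipe with no outcome in the formula, then derive one
rec <- recipe(~ ., data = iris) %>%
  # Tag the new column as the outcome when it is created
  step_mutate(SepalSquared = Sepal.Length^2, role = "outcome") %>%
  # Select by role instead of by name (assumes the role set above
  # is visible to selectors in later steps during prep())
  step_naomit(all_outcomes())

# Estimate the recipe on the training data
prepped <- prep(rec)

# new_data = NULL returns the processed training set
baked <- bake(prepped, new_data = NULL)

# Inspect the roles and the derived column
summary(prepped)
head(baked$SepalSquared)
```

Selecting by role rather than by column name means the later steps do not need to change if the derived outcome is renamed.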

Upvotes: 3
