Reputation: 437
I am trying to use the tfdatasets package in R to build a pipeline that takes a tibble/data frame and outputs the response variable, Species, one-hot encoded. How do I transform the response variable (y) with tfdatasets so that Species comes out one-hot encoded?
Desired output is:
versicolor, setosa, virginica
0, 1, 0 ...
Upvotes: 1
Views: 68
Reputation: 437
As explained in the comment above, this is a workaround that works for my purposes, but is not necessarily a 100% pure tfdatasets solution.
library(tidyverse)
library(lubridate)
library(rsample)
library(recipes)
library(reticulate)
library(tensorflow)
library(tfdatasets)
library(keras)
iris %>%
  recipe(Species ~ .) %>%
  step_dummy(Species,
             one_hot = TRUE) %>%
  prep() %>%
  juice() %>%
  select(contains("Species")) %>%
  as.matrix() %>%
  tensor_slices_dataset()
The solution above does the encoding with recipes before the data reaches tfdatasets, so it is not a pure tfdatasets pipeline. The approach below keeps the encoding inside the tfdatasets pipeline.
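iris %>%
  # Factor codes are 1-based in R, but tf$one_hot() expects 0-based indices
  mutate(Species = as.integer(Species) - 1L) %>%
  select(Species) %>%
  tensor_slices_dataset() %>%
  dataset_map(function(iteration) {
    # Replace the integer label with a length-3 one-hot vector
    iteration$Species <- tf$one_hot(iteration$Species, 3L)
    iteration
  })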
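One detail worth checking in the pipeline above is the indexing: as.integer() on an R factor returns 1-based codes (1, 2, 3), while tf$one_hot() treats its indices as 0-based and, per the TensorFlow documentation, returns an all-zero row for an index outside 0..depth-1. The base-R sketch below illustrates the effect without needing TensorFlow. Note that one_hot_base() is a hypothetical helper written for this illustration only; it is not part of tfdatasets or tensorflow and only imitates the documented behaviour.

```r
# Hypothetical helper (not from tfdatasets/tensorflow): imitates tf$one_hot()
# for 0-based integer indices. Out-of-range indices yield an all-zero row.
one_hot_base <- function(indices, depth) {
  out <- matrix(0, nrow = length(indices), ncol = depth)
  valid <- indices >= 0L & indices < depth
  # Set column (index + 1) to 1 for every row with a valid index
  out[cbind(which(valid), indices[valid] + 1L)] <- 1
  out
}

codes   <- as.integer(iris$Species)  # 1-based factor codes: 1, 2, 3
shifted <- codes - 1L                # 0-based indices: 0, 1, 2

unshifted_result <- one_hot_base(codes, 3L)
shifted_result   <- one_hot_base(shifted, 3L)
colnames(shifted_result) <- levels(iris$Species)

# With 1-based codes, every virginica row (code 3) is out of range
virginica_rows <- unshifted_result[iris$Species == "virginica", ]

# With 0-based indices, each row contains exactly one 1
head(shifted_result, 3)
```

If the same shift is missing from the real pipeline, the symptom would be the same: one class silently encoded as all zeros and the other two shifted by one column.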
Upvotes: 2