NataliaK

Reputation: 85

How to add a new categorical variable to an h2o data frame

I'm trying to add a new categorical variable to an h2o frame. I have created the new variable based on some requirements and am trying to get its values into the h2o frame, but I'm getting an error.

New variable to be added:

late_arrival <- with(flights,
 ifelse(arr_delay>=30,1,
 ifelse(arr_delay<30,0,NA)))
table(late_arrival)

I tried to use mutate to add this new variable to the existing h2o frame:

flights_new <- select(flights.hex) %>%
  mutate(late_arrival)

Error in UseMethod("select_") : no applicable method for 'select_' applied to an object of class "H2OFrame"

I have also tried the collect function:

flights_new <- select(flights.hex, late_arrival) %>% collect()

Error in UseMethod("select_") : no applicable method for 'select_' applied to an object of class "H2OFrame"

How can I add a new categorical variable to an h2o data frame?

Upvotes: 2

Views: 1023

Answers (1)

phiver

Reputation: 23598

The dplyr verbs (select, mutate) have no methods for H2OFrame objects, which is why you get that error. Either make the change before you load the data into the h2o cluster, or make it on the cluster side, directly on your flights.hex. See the example below with mtcars.

# change before loading data into h2o:
mtcars$new_condition <- ifelse(mtcars$mpg >= 20, 1,
                               ifelse(mtcars$mpg < 20, 0, NA))

library(h2o)
h2o.init()

mtcars.hex <- as.h2o(mtcars)

# change when data is inside h2o cluster
mtcars.hex$new_condition2 <- ifelse(mtcars.hex$mpg >= 20, 1,
                                    ifelse(mtcars.hex$mpg < 20, 0, NA))

mtcars.hex

   mpg cyl disp  hp drat    wt  qsec vs am gear carb new_condition new_condition2
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4             1              1
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4             1              1
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1             1              1
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1             1              1
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2             0              0
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1             0              0

[32 rows x 13 columns]
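
Note that both columns above are numeric 0/1 flags. If you need a true categorical (enum) column, one possible approach is to convert the flag to a factor, either on the cluster or locally before uploading. The sketch below is only an illustration on mtcars: the column names high_mpg and high_mpg2 are made up for the example, and the second option assumes the local vector is in the same row order as the H2OFrame.

```r
library(h2o)
h2o.init()

mtcars.hex <- as.h2o(mtcars)

# Option 1: build the flag on the cluster, then convert it to a factor (enum).
# high_mpg is a hypothetical column name used only for this example.
mtcars.hex$high_mpg <- h2o.asfactor(h2o.ifelse(mtcars.hex$mpg >= 20, 1, 0))

# Option 2: build the factor locally, upload it as a one-column frame,
# and column-bind it to the existing H2OFrame.
# Assumption: the local rows are in the same order as the rows in mtcars.hex.
high_mpg_local <- data.frame(high_mpg2 = factor(as.integer(mtcars$mpg >= 20)))
mtcars.hex <- h2o.cbind(mtcars.hex, as.h2o(high_mpg_local))

# Inspect the levels of the new categorical column.
h2o.levels(mtcars.hex$high_mpg)
```

The same pattern should carry over to the question's data, for example by binding as.h2o(data.frame(late_arrival = factor(late_arrival))) to flights.hex, provided late_arrival was computed from the same rows in the same order.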

Upvotes: 0
