Nneka

Reputation: 1860

Alternative to cut function in R for data.tables - integer variables to factors

I want to convert the integer variable hp to a categorical variable, cut by 10.

library(data.table)
mtcars <- as.data.table(mtcars)  # mtcars ships as a data.frame

mtcars[, hp_cat := cut(hp,
    breaks = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, Inf),
    include.lowest = TRUE)]

This yields the desired result, but it is quite tedious to write out all the numbers. Is there a faster way? Ideally, the alternative would also produce nicer factor names.

Note: I would like to keep the result in a data.table, so no dplyr please.

Upvotes: 4

Views: 824

Answers (2)

chinsoon12

Reputation: 25225

Another option which should be faster:

mtcars[, hp_cat2 := ceiling(hp/10)*10][hp_cat2 > 160, hp_cat2 := Inf]

This uses the right-hand limit of each interval as its name, which gives you nicer labels. Note that hp_cat2 is numeric here, not a factor.
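If an actual factor is needed, the numeric column from the one-liner above can be converted afterwards. A minimal sketch, assuming a copy of mtcars named dt (a name chosen here only for illustration):

```r
library(data.table)

# Work on a data.table copy; "dt" is an illustrative name, not from the answer.
dt <- as.data.table(mtcars)

# Round hp up to the next multiple of 10, then lump everything above 160 into Inf.
dt[, hp_cat2 := ceiling(hp / 10) * 10][hp_cat2 > 160, hp_cat2 := Inf]

# Convert to a factor; levels are the right-hand limits that occur in the data.
dt[, hp_cat2 := factor(hp_cat2)]
```

Unlike cut(), this only creates levels for bins that actually contain observations.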

Upvotes: 0

MatthewR

Reputation: 2770

Just use seq() to generate the breaks. Depending on your data, you may need -Inf as the first element of the vector. The labels parameter of cut() also lets you assign names; for example, this works with the code below: labels = paste0("Group", 2:length(BRKS))

BRKS <- c(seq(0, 160, 10), Inf)

mtcars[, hp_cat := cut(hp, breaks = BRKS, include.lowest = TRUE)]
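For range-style names instead of "Group" labels, the labels can be built from the break vector itself. A rough sketch, assuming a copy of mtcars named dt and a "lower-upper" label format; both are choices made here for illustration:

```r
library(data.table)

# Work on a data.table copy; "dt" is an illustrative name.
dt <- as.data.table(mtcars)

brks <- c(seq(0, 160, 10), Inf)

# One label per interval: "0-10", "10-20", ..., "150-160", "160+"
lo <- head(brks, -1)
hi <- brks[-1]
labs <- ifelse(is.finite(hi), paste0(lo, "-", hi), paste0(lo, "+"))

dt[, hp_cat := cut(hp, breaks = brks, labels = labs, include.lowest = TRUE)]
```

Because cut() closes intervals on the right by default, "100-110" here means hp greater than 100 and at most 110.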

Upvotes: 4
