Alex Kors

Reputation: 202

Using gsub() in a data.table

I have a big data table (about 20,000 rows). One of its columns contains integers from 1 to 6.

I also have a character vector of car models (6 models).

I'm trying to replace the integers with the corresponding car model (just 2 in this example):

 gsub("1",paste0(labels[1]),Models)
 gsub("2",paste0(labels[2]),Models) 
 ...  

"Models" is the name of a column.

labels <- c("Altima","Maxima")

After fighting with it for 12+ hours, gsub() still isn't working.

Sample data:
mydata<-data.table(replicate(1,sample(1:6,10000,rep=TRUE)))
labels<-c("altima","maxima","sentra","is","gs","ls")

Upvotes: 0

Views: 3663

Answers (2)

Jeff Millard

Reputation: 173

You could try using factor() in the following way; it worked for me on your test data. This assumes that the first column of mydata is named V1 (the default) and that models is the character vector of model names (called labels in the question).

mydata$V1 <- factor(mydata$V1, labels=models)

Upvotes: 0

MrFlick

Reputation: 206391

I don't think you need gsub here. What you are describing is a factor variable.

If your data is

mydata <- data.table(replicate(1,sample(1:6,1000,rep=TRUE)))
models <- c("altima","maxima","sentra","is","gs","ls")

you could just do

mydata[[1]] <- factor(mydata[[1]], levels=seq_along(models), labels=models)

If you really wanted a character rather than a factor, then

mydata[[1]] <- models[ mydata[[1]] ]

would also do the trick. Both of these require that the numbers be consecutive integers starting at 1.
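
Since the question uses a data.table, the same two ideas can also be written with data.table's := operator, which adds columns by reference. This is only a sketch: the column names model_chr and model_fct are made up for illustration, V1 is assumed to be the column holding the integers, and the codes vector at the end is a hypothetical example of numbers that do not run from 1 to n.

```r
library(data.table)

set.seed(1)  # make the sample reproducible
mydata <- data.table(V1 = sample(1:6, 1000, replace = TRUE))
models <- c("altima", "maxima", "sentra", "is", "gs", "ls")

# Character result: use the integer codes as positions into `models`.
mydata[, model_chr := models[V1]]

# Factor result: state the levels explicitly so the mapping does not
# depend on which codes happen to be present in the data.
mydata[, model_fct := factor(V1, levels = seq_along(models), labels = models)]

# Hypothetical case: codes that are not 1..n. match() finds the position
# of each code in `codes`, and that position then indexes `models`.
codes <- c(10L, 20L, 30L, 40L, 50L, 60L)
x <- c(20L, 60L, 10L)
models[match(x, codes)]  # "maxima" "ls" "altima"
```

Writing to new columns keeps the original integers in V1, which makes it easy to check the mapping before overwriting anything.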

Upvotes: 2
