Reputation: 13807
I have some data tables with the same structure, and I want to apply a few data transformations to them (create new variables, assign missing values, etc.).
This is what I've tried, without success. The code runs OK, but it does not change the data tables. Any ideas?
library(data.table)
data("mtcars") # load data
setDT(mtcars) # convert to data table
mtcars[gear==5, gear :=NA] # create NA values for the purpose of my application
mtcars2 <- mtcars # create second DT
# Create function
computeWidth <- function(dataset){
  dataset$gear[is.na(dataset$gear)] <- 0 # Convert NA to 0
  dataset[ ,width := hp + gear] # create new variable
}
# Apply function
lapply(list(mtcars, mtcars2), computeWidth)
As you can see, the function works fine, but it didn't modify the data tables. Any thoughts on this?
Upvotes: 4
Views: 2020
Reputation: 92282
Your main problem is that you are using the wrong syntax. Instead of dataset$gear[is.na(dataset$gear)] <- 0 you should be using dataset[is.na(gear), gear := 0]. This way, := modifies your original data set by reference, so the change is visible outside the function that lapply calls, whereas <- only rebinds dataset to a modified copy inside that function and leaves the original untouched. Thus, modifying your function to
computeWidth <- function(dataset){
  dataset[is.na(gear), gear := 0] # Convert NA to 0
  dataset[ ,width := hp + gear] # create new variable
}
and then running
lapply(list(mtcars, mtcars2), computeWidth)
will modify the original data sets.
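One way to confirm that the revised function really updates the tables in place is to inspect them after the lapply call. The following is a minimal sketch, assuming data.table is installed; the names dt1 and dt2 are illustrative and not part of the original question, and copy() is used so that the two tables are distinct objects.

```r
library(data.table)

# Two independent tables built from the built-in mtcars data
# (dt1 and dt2 are hypothetical names used only for this sketch)
dt1 <- as.data.table(mtcars)
dt1[gear == 5, gear := NA]   # introduce NA values, as in the question
dt2 <- copy(dt1)             # a real copy, not a second name for dt1

computeWidth <- function(dataset) {
  dataset[is.na(gear), gear := 0]   # replace NA by reference
  dataset[, width := hp + gear]     # add the new column by reference
  invisible(dataset)
}

# invisible() only suppresses printing of the returned list
invisible(lapply(list(dt1, dt2), computeWidth))

# Inspect the original objects: the new column is there and gear has no NA
names(dt1)
anyNA(dt2$gear)
```

The return value of lapply is discarded here on purpose: the point is that dt1 and dt2 themselves carry the changes.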
As a side note, if you want to generalize this to many data.table objects, you could look into the tables function and try something like the following:
lapply(mget(tables(silent = TRUE)$NAME), computeWidth)
Though it is always best to keep many objects in a single list in the first place instead of filling your global environment with many objects.
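Keeping the tables in a single list could look roughly like the sketch below. The names tables_list and addWidth are hypothetical, chosen only for illustration. Assigning the result of lapply back to the list is a defensive choice: the list stays correct even if data.table has to reallocate a table internally.

```r
library(data.table)

# Hypothetical named list holding every table that needs the same transformation
tables_list <- list(
  a = as.data.table(mtcars),
  b = as.data.table(mtcars)
)

addWidth <- function(dataset) {
  dataset[, width := hp + gear]   # := returns the modified table invisibly
}

# Apply the transformation to every element and keep the list up to date
tables_list <- lapply(tables_list, addWidth)

# Each table now has one extra column
sapply(tables_list, ncol)
```

With this layout there is no need for mget or tables(): the list itself is the inventory of objects to process.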
A very important note (suggested by @Frank): you should be aware that when using <- on an unmodified data.table, you are not actually creating a new object
mtcars2 <- mtcars
tracemem(mtcars)
## [1] "<00000000129264F8>"
tracemem(mtcars2)
## [1] "<00000000129264F8>"
Thus, modifying mtcars alone will also modify mtcars2. Instead, the correct practice is to use copy, as in
mtcars2 <- copy(mtcars)
tracemem(mtcars)
## [1] "<00000000129264F8>"
tracemem(mtcars2)
## [1] "<000000001315F6B8>"
See here for further details.
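The practical effect of the difference between <- and copy can be seen by modifying one table and checking the others. This is a small sketch, assuming data.table is installed; original, alias and snapshot are illustrative names, and address() is the data.table helper that reports an object's memory address.

```r
library(data.table)

original <- as.data.table(mtcars)
alias    <- original         # second name bound to the same object
snapshot <- copy(original)   # independent copy

original[, width := hp + gear]   # modify by reference

"width" %in% names(alias)      # the alias sees the new column
"width" %in% names(snapshot)   # the copy does not

# Same address for the alias, a different one for the copy
address(original) == address(alias)
address(original) == address(snapshot)
```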
Upvotes: 6