Reputation: 2481
A data.table novice question.
I would like to transform a set of columns in a data.table by applying a mathematical formula to them. The set must exclude one or more of the table's columns.
In data.frame terms I would do:
data(iris)
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
iris[, -5] <- iris[, -5] * 1e3
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1         5100        3500         1400         200  setosa
2         4900        3000         1400         200  setosa
3         4700        3200         1300         200  setosa
4         4600        3100         1500         200  setosa
5         5000        3600         1400         200  setosa
6         5400        3900         1700         400  setosa
I know how to select multiple columns in a data.table:
library(data.table)
iris.dt <- data.table(iris)
head(iris.dt[, -5, with = FALSE])
or even:
head(iris.dt[, !"Species", with = FALSE])
How do I actually transform those selected columns, taking advantage of data.table's pass-by-reference semantics?
Upvotes: 12
Views: 4704
Reputation: 61
.SDcols is the right approach, but you can specify the column names just once using a vector.
library(data.table)
DT <- data.table(iris)
colnms <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
DT[, (colnms) := lapply(.SD, function(x) x * 1000), .SDcols = colnms]
Note that you need the parentheses to the left of := to stop data.table interpreting colnms as the name of a column.
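To see why the parentheses matter, here is a small sketch on two copies of iris (DT1 and DT2 are illustrative names, not part of the answer above). It assumes a reasonably current data.table release, where an unparenthesised symbol on the left of := is taken as a literal column name.

```r
library(data.table)

colnms <- c("Sepal.Length", "Sepal.Width")

# Without parentheses: the symbol is treated as a literal column name,
# so a new column called "colnms" is added to the table.
DT1 <- data.table(iris)
DT1[, colnms := 0]
"colnms" %in% names(DT1)

# With parentheses: colnms is evaluated, and the two columns it names
# are overwritten by reference.
DT2 <- data.table(iris)
DT2[, (colnms) := lapply(.SD, function(x) x * 1000), .SDcols = colnms]
head(DT2[, ..colnms], 3)
```

In the first case the original columns are left untouched, which is easy to miss because no error is raised.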
Upvotes: 6
Reputation: 193507
What about using the .SDcols argument along with assignment by reference (:=):
library(data.table)
DT <- data.table(iris)
DT[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width") :=
     lapply(.SD, function(x) x * 1000), .SDcols = 1:4]
# Alternatively you can grab the names the usual way:
# DT[, names(DT)[1:4] := lapply(.SD, function(x) x*1000), .SDcols=1:4]
DT
#      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#   1:         5100        3500         1400         200    setosa
#   2:         4900        3000         1400         200    setosa
#   3:         4700        3200         1300         200    setosa
#   4:         4600        3100         1500         200    setosa
#   5:         5000        3600         1400         200    setosa
#  ---
# 146:         6700        3000         5200        2300 virginica
# 147:         6300        2500         5000        1900 virginica
# 148:         6500        3000         5200        2000 virginica
# 149:         6200        3400         5400        2300 virginica
# 150:         5900        3000         5100        1800 virginica
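Since the question asks for every column except one or more, the target columns can also be derived by exclusion rather than listed or indexed. A minimal sketch, assuming the columns to skip are known by name (the variables exclude, cols and before are illustrative); address() is used here only as a rough check that the data.table object itself was not copied.

```r
library(data.table)

DT <- data.table(iris)

# Columns to leave untouched (illustrative choice); all others are transformed.
exclude <- "Species"
cols <- setdiff(names(DT), exclude)

# Record where the table lives in memory before the update.
before <- address(DT)

# Update the selected columns by reference.
DT[, (cols) := lapply(.SD, function(x) x * 1000), .SDcols = cols]

# The address is unchanged, so the data.table object was modified in place.
identical(before, address(DT))
head(DT)
```

Adding more names to exclude is enough to skip further columns, so the call itself does not need to change.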
Upvotes: 14