Reputation: 330
Here's a quick example of what i need:
data <- data.table(x = c(10, -10, 20, -20, 100, -100),
y = c(10, -10, 120, -120, 100, -100),
z = c(10, -10, 100, -100, 30, -30))
data[x > 90 | x < -90, x:=0]
data[z > 90 | z < -90, z:=0]
This works fine, but I'm looking to replace the last two rows with a simpler version, because there's a lot of columns in my actual data (about 200, out of which i need to update about 50) and a lot of rows (about 2 million, so the data is somewhat heavy)
Apologies if this is a duplicate
Upvotes: 3
Views: 1488
Reputation: 11
You can update using update for reference in Functional form:
data[z > 90 | z < -90 , ':=' (x=0, z=0)]
You can check more info: https://www.rdocumentation.org/packages/data.table/versions/1.12.8/topics/%3A%3D
Upvotes: 1
Reputation: 132969
Probably easiest and most efficient with eval(parse(...))
:
cols <- c("x", "z")
for (col in cols)
eval(parse(text = sprintf("data[%s > 90 | %s < -90, %s := 0][]", col, col, col)))
# x y z
#1: 10 10 10
#2: -10 -10 -10
#3: 20 120 0
#4: -20 -120 0
#5: 0 100 30
#6: 0 -100 -30
However, I would probably melt the data.table:
data <- melt(data)
cols <- c("x", "z")
data[variable %in% cols & (value > 90 | value < -90), value := 0]
# variable value
# 1: x 10
# 2: x -10
# 3: x 20
# 4: x -20
# 5: x 0
# 6: x 0
# 7: y 10
# 8: y -10
# 9: y 120
#10: y -120
#11: y 100
#12: y -100
#13: z 10
#14: z -10
#15: z 0
#16: z 0
#17: z 30
#18: z -30
Upvotes: 5
Reputation: 887901
We can use set
from data.table
for(j in names(data)[c(1,3)]){
set(data, i = which(data[[j]]>90 |data[[j]] < -90), j = j, value = 0)
}
data
# x y z
#1: 10 10 10
#2: -10 -10 -10
#3: 20 120 0
#4: -20 -120 0
#5: 0 100 30
#6: 0 -100 -30
Upvotes: 2