Náthali
Náthali

Reputation: 937

R: Aggregate Data Table

How can I do this in a fastest way, because I have many values in columns

columns <- levels(dt$AER)

dt[, (columns) := 0]

dt[AER=="AMS", AMS :=1]

dt[AER=="BLL", BLL :=1]

dt[AER=="BLQ", BLQ :=1]

dt[AER=="BRE", BRE :=1]

Upvotes: 1

Views: 183

Answers (4)

Colonel Beauvel
Colonel Beauvel

Reputation: 31161

You can do:

cols = unique(dt[,AER])
colsWith1 = c('AMS','BLL','BLQ','BRE')

dt[ , (cols) := lapply(cols, function(u) as.integer(AER %chin% intersect(colsWith1, u))), .SDcols = cols]
#   AER ZZZ AMS BLL BLQ BRE
#1: ZZZ   0   0   0   0   0
#2: AMS   0   1   0   0   0
#3: AMS   0   1   0   0   0
#4: BLL   0   0   1   0   0
#5: BLQ   0   0   0   1   0
#6: BLQ   0   0   0   1   0
#7: BRE   0   0   0   0   1

Data:

dt = data.table(
  AER = c("ZZZ", "AMS", "AMS", "BLL", "BLQ", "BLQ", "BRE"), 
  ZZZ = c(0L, 0L, 0L, 0L, 0L, 0L, 0L), 
  AMS = c(0L, 1L, 1L, 0L, 0L, 0L, 0L), 
  BLL = c(0L, 0L, 0L, 1L, 0L, 0L, 0L), 
  BLQ = c(0L, 0L, 0L, 0L, 1L, 1L, 0L), 
  BRE = c(0L, 0L, 0L, 0L, 0L, 0L, 1L)
)

Upvotes: 1

jangorecki
jangorecki

Reputation: 16697

The fastest way is to use binary search

library(data.table)
dt <- data.table(AER = c("AMS","BLL","BLQ","BRE"), AMS = 0, BLL = 0, BLQ = 0, BRE = 0)
setkey(dt,AER)
dt["AMS", AMS :=1]
dt["BLL", BLL :=1]
dt["BLQ", BLQ :=1]
dt["BRE", BRE :=1]

If you use 1.9.5 you don't even need to do that, it uses auto.index feature and after first filter AER=="AMS" it will build an index behind the scene and reuse it in each next filtering - even using == operator - all behind the scene. You can set options("datatable.verbose"=TRUE) to see when and how it works. So if you are on 1.9.5 just turn the verbose=TRUE and try your own code.

Upvotes: 2

eddi
eddi

Reputation: 49448

dt = data.table(AER = c("ZZZ", "AMS", "AMS", "BLL", "BLQ", "BLQ", 
"BRE"))

dt[, idx := .I]

# use dcast.data.table before 1.9.5
dcast(dt, idx + AER ~ AER, fill = 0, fun = function(x) 1)
#   idx AER AMS BLL BLQ BRE ZZZ
#1:   1 ZZZ   0   0   0   0   1
#2:   2 AMS   1   0   0   0   0
#3:   3 AMS   1   0   0   0   0
#4:   4 BLL   0   1   0   0   0
#5:   5 BLQ   0   0   1   0   0
#6:   6 BLQ   0   0   1   0   0
#7:   7 BRE   0   0   0   1   0

Upvotes: 3

Prolix
Prolix

Reputation: 284

What you wish to achieve seems similar to this post. Adapting @Sven Hohenstein 's answer to you notation, it would become:

cbind(dt, model.matrix( ~ 0 + AER, dt))

Upvotes: 1

Related Questions