chrisjohnh

Reputation: 35

Looping over columns in data.table R

I am trying to loop over columns of a data.table in R. I am having trouble getting the for loop to use the right column when I subset the data.table.

My goal is to count, for each column in a list of column names, the rows where that column equals 1.

Here is my code:


library(data.table)

data <- data.table(va=c(1,0,1), vb=c(1,0,0), vc=c(1,1,1))


names <- c("va", "vc")

for (col in names) {
    print(nrow(data[col == 1,]))
    print(col)
}

Here is the output I get:

[1] 0
[1] "va"
[1] 0
[1] "vc"

Is there something I am missing or a better way of doing this?

Upvotes: 1

Views: 594

Answers (1)

dww

Reputation: 31454

You can use colSums, which is much simpler and faster than looping.

dt <- data.table(va=c(1,0,1), vb=c(1,0,0), vc=c(1,1,1))
col.names <- c("va", "vc")
dt[, colSums(.SD==1), .SDcols = col.names]
# va vc 
# 2  3 

Note: I changed your object names to dt and col.names because data and names are already the names of existing R functions, and reusing them for your own objects is not good practice.

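If you really want to use a for loop (I don't recommend it, but for educational purposes...), you can fix it with get. In your loop, col holds a string such as "va", so col == 1 compares that string with 1 instead of comparing the column's values, and no rows match. get(col) looks up the column with that name, so the comparison runs on its values: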

for (col in col.names) {
  dt[get(col) == 1, print(.N)]
}
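
To make the failure mode concrete, here is a minimal sketch that reproduces the zero-row result outside the loop and then shows one more way to do the lookup by name. It reuses dt and col.names from above; the counts variable and the use of sapply with [[ ]] are my own illustration, not something from the question, and get() or colSums work just as well.

```r
library(data.table)

dt <- data.table(va = c(1, 0, 1), vb = c(1, 0, 0), vc = c(1, 1, 1))
col.names <- c("va", "vc")

# dt has no column called `col`, so inside dt[...] the name resolves to the
# ordinary variable, which holds the string "va".
col <- "va"
col == 1             # a single FALSE: the string "va" is not equal to 1
nrow(dt[col == 1])   # a single FALSE selects no rows

# Illustrative alternative (an assumption of this sketch): extract each column
# by name with [[ ]] and count the matches.
counts <- sapply(col.names, function(col) sum(dt[[col]] == 1))
counts
```

The result is a named vector with one count per column, which is easier to reuse later than values printed inside a loop.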

Upvotes: 1
