Reputation: 647
I use the data.table package in R to summarize data often. In this particular case, I'm just counting the number of occurrences in a dataset for given column groups. But I'm having trouble incorporating a loop to do this dynamically.
Normally, I'd summarize data like this.
library(data.table)
library(ggplot2)  # provides the mpg dataset
data <- data.table(mpg)
data.temp1 <- data[, .N, by="manufacturer,class"]
data.temp2 <- data[, .N, by="manufacturer,trans"]
But now I want to loop through the columns of interest in my dataset and plot. Rather than repeating the code over and over, I want to put it in a for loop. Something like this:
columns <- c('class', 'trans')
for (i in 1:length(columns)) {
data.temp <- data[, .N, by=list(manufacturer,columns[i])]
#plot data
}
If I only wanted the column of interest, I could do this in the loop and it works:
data.temp <- data[, .N, by=get(columns[i])]
But if I want to put in a static column name, like manufacturer, it breaks. I can't seem to figure out how to mix a static column name along with a dynamic one. I've looked around but can't find an answer.
Would appreciate any thoughts!
Upvotes: 1
Views: 104
Reputation: 206253
You should be fine if you just quote `"manufacturer"` and pass `by=` a character vector:
data.temp <- data[, .N, by=c("manufacturer",columns[i])]
From the `?'[.data.table'` help page, `by=` can be:
A single unquoted column name, a list() of expressions of column names, a single character string containing comma separated column names (where spaces are significant since column names may contain spaces even at the start or end), or a character vector of column names.
This seems like the easiest way to give you what you need.
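For completeness, here is a minimal sketch of the whole loop using the character-vector form of `by=`. The small table below is a made-up stand-in for `mpg` so the snippet runs without ggplot2, and the `results` list and `col` loop variable are names introduced just for this illustration:

```r
library(data.table)

# Made-up stand-in for ggplot2::mpg (illustrative rows only)
data <- data.table(
  manufacturer = c("audi", "audi", "ford", "ford", "ford"),
  class        = c("compact", "compact", "suv", "pickup", "suv"),
  trans        = c("auto", "manual", "auto", "auto", "manual")
)

columns <- c("class", "trans")
results <- list()  # hypothetical container for the per-column summaries

for (col in columns) {
  # by= takes a character vector, so the static name and the
  # dynamic name can simply be combined with c()
  results[[col]] <- data[, .N, by = c("manufacturer", col)]
  # plot results[[col]] here
}
```

Each element of `results` should then have the columns `manufacturer`, the looped column, and `N`. Iterating over the names directly also avoids the `1:length(columns)` indexing from the question.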
Upvotes: 5