Dan

Reputation: 2675

data.table grouping with variable names

I'm trying to create a summarised data.table from an existing one. I want to do this inside a function that takes a column prefix as an argument, so that the new columns can be prefixed as required.

I've seen the question/response here but am trying to work out how to do it when not using the := operator.

Reprex:

library(data.table)
tbl1 <- data.table(urn = c("a", "a", "a", "b", "b", "b"),
           amount = c(1, 2, 1, 3, 3, 4))

#    urn amount
# 1:   a      1
# 2:   a      2
# 3:   a      1
# 4:   b      3
# 5:   b      3
# 6:   b      4

tbl2 <- tbl1[, .(mean_amt = mean(amount),
                 rows = .N),
             by = urn]

#    urn mean_amt rows
# 1:   a 1.333333    3
# 2:   b 3.333333    3

This uses fixed names for the columns being created; however, as mentioned, I'd like to be able to include a prefix.

I've tried the following:

prefix <- "mypfx_"
tbl2 <- tbl1[, .(paste0(prefix, mean_amt) = mean(amount),
                 paste0(prefix, rows) = .N),
             by = urn]

# Desired output
#    urn mypfx_mean_amt mypfx_rows
# 1:   a       1.333333          3
# 2:   b       3.333333          3

Unfortunately that code fails with the error: Error: unexpected '=' in " tbl2 <- tbl1[, .(paste0(prefix, mean_amt) ="

Any thoughts on how to make the above work would be appreciated.

Upvotes: 3

Views: 395

Answers (1)

akuiper

Reputation: 215117

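The attempt fails at parse time because the left-hand side of = inside .() must be a literal name; R will not evaluate an expression such as paste0(...) there. Instead, build an unnamed list and use setNames to name the columns dynamically: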

prefix <- "mypfx_"
tbl2 <- tbl1[, setNames(list(mean(amount), .N), paste0(prefix, c("mean_amt", "rows"))), 
               by = urn]

tbl2
#   urn mypfx_mean_amt mypfx_rows
#1:   a       1.333333          3
#2:   b       3.333333          3
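
Since the question asks for this inside a function, here is a minimal sketch of one way to wrap it. The function names summarise_with_prefix and summarise_then_rename are hypothetical, chosen only for illustration, and the column names urn and amount are assumed from the reprex. The second variant is an alternative I'm adding, not part of the approach above: it summarises with fixed names and then renames by reference with data.table's setnames().

```r
library(data.table)

# Hypothetical helper (illustrative name): summarise by urn and
# prefix the new column names with `prefix`.
summarise_with_prefix <- function(dt, prefix) {
  new_names <- paste0(prefix, c("mean_amt", "rows"))
  # setNames() attaches the computed names to the unnamed list in j
  dt[, setNames(list(mean(amount), .N), new_names), by = urn]
}

# Alternative sketch: summarise with fixed names first, then rename
# the result by reference with setnames().
summarise_then_rename <- function(dt, prefix) {
  out <- dt[, .(mean_amt = mean(amount), rows = .N), by = urn]
  old_names <- c("mean_amt", "rows")
  setnames(out, old_names, paste0(prefix, old_names))
  out
}

tbl1 <- data.table(urn = c("a", "a", "a", "b", "b", "b"),
                   amount = c(1, 2, 1, 3, 3, 4))

tbl2 <- summarise_with_prefix(tbl1, "mypfx_")
tbl3 <- summarise_then_rename(tbl1, "mypfx_")
```

Both calls should give a table with columns urn, mypfx_mean_amt and mypfx_rows. The second form keeps the j expression readable when there are many summary columns, at the cost of a separate renaming step.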

Upvotes: 3
