user1412

Reputation: 729

r data.table usage of .SD with multiple column sets to get RMSPE

I have a data set on which I have made some predictions. I now want to calculate the RMSPE, and for this I am using the MLmetrics package; as I understand it, passing the predicted and actual values returns the RMSPE. However, I am confused about how to use this within data.table when I need to pass two sets of columns.

My sample data set looks something like this:

library(data.table)
library(MLmetrics)

set.seed(123)
id <- seq(1001, 1100, 1)
city <- sample(1:4, 100, replace = TRUE)
a1 <- sample(1:100, 100, replace = TRUE)
a2 <- sample(1:100, 100, replace = TRUE)
a3 <- sample(1:100, 100, replace = TRUE)
a4 <- sample(1:100, 100, replace = TRUE)
a5 <- sample(1:100, 100, replace = TRUE)
p1 <- sample(1:100, 100, replace = TRUE)
p2 <- sample(1:100, 100, replace = TRUE)
p3 <- sample(1:100, 100, replace = TRUE)
p4 <- sample(1:100, 100, replace = TRUE)
p5 <- sample(1:100, 100, replace = TRUE)

df1 <- data.table(id, city, a1, a2, a3, a4, a5, p1, p2, p3, p4, p5)

RMSPE <- df1[, lapply(.SD, function(x,y) RMSPE(x,y),
                       by = city, .SDcols = **xxxx**)] 

Here a1, a2, a3, a4, a5 are my actual values and p1, p2, p3, p4, p5 are my predicted values. I want to pass p1, p2, p3, p4, p5 as x and a1, a2, a3, a4, a5 as y. The output I expect is a summary table with 4 rows (one for each city) and 6 columns: the first for city, and columns 2-6 for the RMSPE of each variable.

How can I get this in data.table? What should I replace xxxx with?

Thank you !!

Upvotes: 1

Views: 214

Answers (1)

fidelin

Reputation: 320

I'm not sure if this is what you are looking for:

colsToKeep <- c("a1", "a2", "a3", "a4", "a5")  # actual values
colsToW <- c("p1", "p2", "p3", "p4", "p5")     # predicted values

# MLmetrics::RMSPE takes (y_pred, y_true), so the predicted columns go first.
# Map() pairs p1 with a1, p2 with a2, and so on, within each city.
df1[, setNames(Map(RMSPE, .SD[, ..colsToW], .SD[, ..colsToKeep]),
               paste("RMSPE", colsToKeep, colsToW, sep = "_")),
    by = city]
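
If you want to see what is going on without MLmetrics, here is a minimal sketch of the same idea on a tiny made-up table. The `rmspe` helper, the toy data, and the names `actual_cols` and `pred_cols` are my own assumptions for illustration; the helper takes predictions first and actuals second, which as far as I know matches the `RMSPE(y_pred, y_true)` argument order in MLmetrics.

```r
library(data.table)

# Hypothetical helper using the usual RMSPE definition:
# sqrt(mean(((actual - predicted) / actual)^2))
rmspe <- function(pred, actual) sqrt(mean(((actual - pred) / actual)^2))

# Toy data (not the data from the question)
dt <- data.table(
  city = c(1, 1, 2, 2),
  a1 = c(10, 20, 10, 40), p1 = c(10, 20, 5, 20),
  a2 = c(4, 8, 2, 2),     p2 = c(2, 4, 2, 2)
)

actual_cols <- c("a1", "a2")
pred_cols   <- c("p1", "p2")

res <- dt[, {
  cols <- as.list(.SD)  # plain list of this group's columns
  # Pair each predicted column with its actual column, in order
  setNames(Map(rmspe, cols[pred_cols], cols[actual_cols]),
           paste0("RMSPE_", actual_cols))
}, by = city, .SDcols = c(actual_cols, pred_cols)]

print(res)
```

The result has one row per city and one RMSPE column per actual/predicted pair. Restricting `.SD` with `.SDcols` and then splitting it into the two column sets inside `j` is just one way to do it; the pairing relies on `actual_cols` and `pred_cols` being in the same order.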

Upvotes: 2
