Reputation: 2017
I am trying to turn the code below, which already works, into a function.
A similar situation, dcast + DT, has already been discussed here, but I wasn't able to solve my problem that way.
What I want to achieve is:
This is the code that works already:
result1 <- dcast(setDT(data), customer_id ~ paste0("num_of_oranges",period), value.var = "num_of_oranges", sum)
result2 <- dcast(setDT(data), customer_id ~ paste0("num_of_oranges",period) + paste0("SIGN_",sign), value.var = "num_of_oranges", sum)
result3 <- dcast(setDT(data), customer_id ~ paste0("num_of_oranges",period) + paste0("SIGN_",sign) + paste0("ORIGIN_",origin), value.var = "num_of_oranges", sum)
My attempt towards the function:
create.Feature <- function(col1, stat) {
test1 <- dcast(df, df[[id]] ~ paste0("col1",df[[period]]), value.var = df[["col1"]], stat)
return(test1)
test2 <- dcast(df, df[[id]] ~ paste0("col1",df[[period]]) + paste0("SIGN",df[[sign]]), value.var = df[["col1"]], stat)
return(test2)
test3 <- dcast(df, df[[id]] ~ paste0("col1",df[[period]]) + paste0("SIGN",df[[sign]]) + paste0("ORIGIN",df[[origin]]), value.var = df[["col1"]], stat)
return(test3)
}
And the call:
test_result <- create.Feature("num_of_oranges", sum)
I get the following error: Error in .subset2(x, i, exact = exact) : no such index at level 1
Anyone?
Upvotes: 0
Views: 1193
Reputation: 178
I tried using the mtcars dataset to reproduce your function.
Code:
library(data.table)

cars <- mtcars

# The three hand-written calls, translated to mtcars columns
result1 <- dcast(setDT(cars), cyl ~ paste0("disp", gear),
                 value.var = "disp",
                 sum)
result2 <- dcast(setDT(cars), cyl ~ paste0("disp", gear) +
                   paste0("am", am),
                 value.var = "disp",
                 sum)
result3 <- dcast(setDT(cars), cyl ~ paste0("disp", gear) +
                   paste0("am", am) +
                   paste0("vs", vs),
                 value.var = "disp",
                 sum)

# The same three casts wrapped in a function; gear, am and vs stay hard-coded
create.Feature <- function(df, id, col1) {
  test1 <- dcast(df,
                 df[[id]] ~ paste0(col1, df[["gear"]]),
                 value.var = col1,
                 sum)
  test2 <- dcast(df,
                 df[[id]] ~ paste0(col1, df[["gear"]]) +
                   paste0("am", df[["am"]]),
                 value.var = col1,
                 sum)
  test3 <- dcast(df,
                 df[[id]] ~ paste0(col1, df[["gear"]]) +
                   paste0("am", df[["am"]]) +
                   paste0("vs", df[["vs"]]),
                 value.var = col1,
                 sum)
  # Return all three results together
  list(test1, test2, test3)
}

tr <- create.Feature(df = cars,
                     id = "cyl",
                     col1 = "disp")
Output:
tr
[[1]]
df disp3 disp4 disp5
1: 4 120.1 821.0 215.4
2: 6 483.0 655.2 145.0
3: 8 4291.4 0.0 652.0
[[2]]
df disp3_am0 disp4_am0 disp4_am1 disp5_am1
1: 4 120.1 287.5 533.5 215.4
2: 6 483.0 335.2 320.0 145.0
3: 8 4291.4 0.0 0.0 652.0
[[3]]
df disp3_am0_vs0 disp3_am0_vs1 disp4_am0_vs1 disp4_am1_vs0
1: 4 0.0 120.1 287.5 0
2: 6 0.0 483.0 335.2 320
3: 8 4291.4 0.0 0.0 0
disp4_am1_vs1 disp5_am1_vs0 disp5_am1_vs1
1: 533.5 120.3 95.1
2: 0.0 145.0 0.0
3: 0.0 652.0 0.0
A few points though:
- You hard-coded df[[sign]] and df[[origin]], so I did the same.
- I couldn't pass stat into the function, which is why I used sum inside the function instead of stat. I can't figure out what the problem is. I tried match.fun() and do.call, but couldn't get either to work.
- Since test3 was the last statement, I assumed you want all three of test1, test2 and test3, so I combined them into a list and made that the output (the last statement). I'm not sure this is what you want; if not, I hope you'll get there soon. I personally don't use data.table; I mostly use dplyr.
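On the stat point: one way to make the aggregation function a real argument is to aggregate first with ordinary data.table grouping and only reshape afterwards, so stat never has to be handed to dcast() at all. The following is a minimal sketch under that assumption, not a fix for whatever stopped match.fun() and do.call from working. The helper name cast_feature and its parameters (id_col, value_col, by_cols) are hypothetical names introduced for illustration; by_cols is assumed to be a named character vector whose names are the label prefixes and whose values are the columns to spread.

```r
library(data.table)

# Hypothetical helper, not from the original posts.
# by_cols: named character vector; names are label prefixes,
#          values are the columns to spread, e.g. c(disp = "gear").
cast_feature <- function(df, id_col, value_col, by_cols, stat = sum) {
  dt <- as.data.table(df)

  # Step 1: aggregate with plain grouping, so `stat` is called as a
  # normal R function and is never passed to dcast().
  agg <- dt[, lapply(.SD, stat),
            by = c(id_col, unname(by_cols)),
            .SDcols = value_col]

  # Step 2: build the same kind of formula as the hand-written calls,
  # e.g. cyl ~ paste0("disp", gear) + paste0("am", am)
  rhs <- paste0('paste0("', names(by_cols), '", ', by_cols, ")",
                collapse = " + ")
  f <- as.formula(paste(id_col, "~", rhs))

  # Every id/by_cols combination is unique after step 1,
  # so no fun.aggregate is needed here.
  dcast(agg, f, value.var = value_col)
}

# One result per nesting level, returned together as a named list
spec <- c(disp = "gear", am = "am", vs = "vs")
results <- lapply(1:3, function(k) {
  cast_feature(mtcars, "cyl", "disp", spec[seq_len(k)], stat = sum)
})
names(results) <- paste0("test", 1:3)
```

One difference from the output above: combinations that do not occur in the data come out as NA rather than 0, because nothing is aggregated during the cast itself. If zeros are wanted, passing fill = 0 to dcast() should restore them. For the original data, the call would presumably look like cast_feature(data, "customer_id", "num_of_oranges", c(num_of_oranges = "period", SIGN_ = "sign", ORIGIN_ = "origin")).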
Upvotes: 1