Reputation: 3082
I have a data frame which looks like this
Customer Mon1 Mon2 Mon3 Mon4 Mon5 Mon6 Mon7 Mon8 Mon9 Mon10 Mon11 Mon12
1 BS:100021 83.6140 76.7849 71.8369 66.8452 66.9263 53.5129 48.0321 44.5750 34.4080 40.8653 40.6369 24.8658
2 BS:100022 -0.5097 0.9198 -1.6027 -0.7160 -40.7443 -40.8863 -40.7049 -40.4623 -48.8805 -44.8879 -48.3559 -39.8656
3 BS:100025 106.4243 102.3998 100.9183 99.1006 98.1092 95.8125 94.2770 95.9911 92.1445 94.0984 87.7465 86.1946
4 BS:100037 37.5871 37.5888 37.5905 37.5924 37.5941 37.5957 37.5977 37.5993 37.5797 50.8395 50.8416 37.6064
5 BS:100050 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
6 BS:100056 780.4214 88.0918 88.9721 88.8143 90.9508 97.9842 96.6309 101.7312 84.7743 76.5239 133.6655 86.3668
7 BS:100063 15.1694 15.1993 17.6528 19.6854 23.9929 27.5048 18.1503 19.8184 17.3152 17.3084 18.4588 24.0067
8 BS:100079 0.0292 0.0827 0.3120 0.1206 1.6245 2.3239 2.5857 0.1718 0.4340 0.6849 3.2916 2.2456
9 BS:100089 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
10 BS:100091 0.1324 2.8137 0.9854 0.2405 0.2811 0.1312 0.0174 2.4304 0.7994 0.5884 0.2618 0.1233
Now I want to create another data frame for each customer for next twelve months I am at a loss how to create another data frame with next 12 months data .For e.g I have tried
make_new_data <- function(x){
require(forecast)
ts_object <- as.numeric(x[-1])
forecasted_data <- data.frame(naive(ts_object,12))[,1]
new_data <- c(x[1],as.character(forecasted_data))
return(new_data)
}
z <- apply(test_data_entry,1,make_new_data)
This is obviously not working as c(x[1],as.character(forecasted_data))
does not make a vector such as .
1 BS:100021 83.614 76.7849 71.8369 66.8452 66.9263 53.5129 48.0321 44.575 34.408 40.8653 40.6369 24.8658
How do I get this to work ,secondly is there a way to do it in a faster way
Upvotes: 0
Views: 144
Reputation: 10152
You have the data in the wide-format, a fast (I would also say nicer) way is to use the long-format. My approach uses data.table.
First, you need to transform your data from wide to long, using the melt
-function of the reshape2
-package. Given what your data looks like, it should be something like this (assuming your data is called test_data_entry
):
library(data.table)
library(reshape2)
dt.wide <- as.data.table(test_data_entry)
dt.long <- melt(dt.wide, id.vars = "Customer")
# doing some minor changes
dt <- dt.long[, .(Customer,
Month = as.numeric(gsub("Mon", "", variable)),
value)] # replace the MonX with X etc.
dt[, forecasted.value := as.data.frame(naive(value, 12))[,1],
by = Customer]
dt[order(Customer)]
# Customer Month value forecasted.value
# 1: BS:100021 1 83.6140 24.8658
# 2: BS:100021 2 76.7849 24.8658
# 3: BS:100021 3 71.8369 24.8658
# 4: BS:100021 4 66.8452 24.8658
# 5: BS:100021 5 66.9263 24.8658
# ---
# 116: BS:100091 8 2.4304 0.1233
# 117: BS:100091 9 0.7994 0.1233
# 118: BS:100091 10 0.5884 0.1233
# 119: BS:100091 11 0.2618 0.1233
# 120: BS:100091 12 0.1233 0.1233
You are right, the as.character
messes around. This should fix it:
library(forecast)
make_new_data <- function(x){
ts_object <- as.numeric(x[-1])
forecasted_data <- data.frame(naive(ts_object, 12))[, 1]
new_data <- data.frame(as.character(x[1]), t(forecasted_data))
names(new_data) <- c("Customer", paste0("Mon", 1:12))
return(new_data)
}
z <- data.table::rbindlist(apply(test_data_entry,1,make_new_data))
z
# Customer Mon1 Mon2 Mon3 Mon4 Mon5 Mon6 Mon7 Mon8 Mon9
# 1: BS:100021 24.8658 24.8658 24.8658 24.8658 24.8658 24.8658 24.8658 24.8658 24.8658
# 2: BS:100022 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656
# 3: BS:100025 86.1946 86.1946 86.1946 86.1946 86.1946 86.1946 86.1946 86.1946 86.1946
# 4: BS:100037 37.6064 37.6064 37.6064 37.6064 37.6064 37.6064 37.6064 37.6064 37.6064
# 5: BS:100050 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
# 6: BS:100056 86.3668 86.3668 86.3668 86.3668 86.3668 86.3668 86.3668 86.3668 86.3668
# 7: BS:100063 24.0067 24.0067 24.0067 24.0067 24.0067 24.0067 24.0067 24.0067 24.0067
# 8: BS:100079 2.2456 2.2456 2.2456 2.2456 2.2456 2.2456 2.2456 2.2456 2.2456
# 9: BS:100089 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
# 10: BS:100091 0.1233 0.1233 0.1233 0.1233 0.1233 0.1233 0.1233 0.1233 0.1233
# Mon10 Mon11 Mon12
# 1: 24.8658 24.8658 24.8658
# 2: -39.8656 -39.8656 -39.8656
# 3: 86.1946 86.1946 86.1946
# 4: 37.6064 37.6064 37.6064
# 5: 0.0000 0.0000 0.0000
# 6: 86.3668 86.3668 86.3668
# 7: 24.0067 24.0067 24.0067
# 8: 2.2456 2.2456 2.2456
# 9: 0.0000 0.0000 0.0000
# 10: 0.1233 0.1233 0.1233
Upvotes: 2