Bg1850
Bg1850

Reputation: 3082

How to create a data frame with the given data in row by row manner

I have a data frame which looks like this

  Customer     Mon1     Mon2     Mon3    Mon4     Mon5     Mon6     Mon7     Mon8     Mon9    Mon10    Mon11    Mon12
1  BS:100021  83.6140  76.7849  71.8369 66.8452  66.9263  53.5129  48.0321  44.5750  34.4080  40.8653  40.6369  24.8658
2  BS:100022  -0.5097   0.9198  -1.6027 -0.7160 -40.7443 -40.8863 -40.7049 -40.4623 -48.8805 -44.8879 -48.3559 -39.8656
3  BS:100025 106.4243 102.3998 100.9183 99.1006  98.1092  95.8125  94.2770  95.9911  92.1445  94.0984  87.7465  86.1946
4  BS:100037  37.5871  37.5888  37.5905 37.5924  37.5941  37.5957  37.5977  37.5993  37.5797  50.8395  50.8416  37.6064
5  BS:100050   0.0000   0.0000   0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
6  BS:100056 780.4214  88.0918  88.9721 88.8143  90.9508  97.9842  96.6309 101.7312  84.7743  76.5239 133.6655  86.3668
7  BS:100063  15.1694  15.1993  17.6528 19.6854  23.9929  27.5048  18.1503  19.8184  17.3152  17.3084  18.4588  24.0067
8  BS:100079   0.0292   0.0827   0.3120  0.1206   1.6245   2.3239   2.5857   0.1718   0.4340   0.6849   3.2916   2.2456
9  BS:100089   0.0000   0.0000   0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
10 BS:100091   0.1324   2.8137   0.9854  0.2405   0.2811   0.1312   0.0174   2.4304   0.7994   0.5884   0.2618   0.1233

Now I want to create another data frame for each customer for next twelve months I am at a loss how to create another data frame with next 12 months data .For e.g I have tried

make_new_data <- function(x){
  require(forecast)
  ts_object <- as.numeric(x[-1])
  forecasted_data <- data.frame(naive(ts_object,12))[,1]
  new_data <- c(x[1],as.character(forecasted_data))
  return(new_data)
}
z <- apply(test_data_entry,1,make_new_data)

This is obviously not working as c(x[1],as.character(forecasted_data)) does not make a vector such as .

1 BS:100021 83.614 76.7849 71.8369 66.8452 66.9263 53.5129 48.0321 44.575 34.408 40.8653 40.6369 24.8658

How do I get this to work ,secondly is there a way to do it in a faster way

Upvotes: 0

Views: 144

Answers (1)

David
David

Reputation: 10152

You have the data in the wide-format, a fast (I would also say nicer) way is to use the long-format. My approach uses data.table.

First, you need to transform your data from wide to long, using the melt-function of the reshape2-package. Given what your data looks like, it should be something like this (assuming your data is called test_data_entry):

library(data.table)
library(reshape2)
dt.wide <- as.data.table(test_data_entry)
dt.long <- melt(dt.wide, id.vars = "Customer")
# doing some minor changes
dt <- dt.long[, .(Customer, 
       Month = as.numeric(gsub("Mon", "", variable)),
       value)] # replace the MonX with X etc.

dt[, forecasted.value := as.data.frame(naive(value, 12))[,1], 
   by = Customer]
dt[order(Customer)]
#    Customer  Month   value forecasted.value
# 1: BS:100021     1 83.6140          24.8658
# 2: BS:100021     2 76.7849          24.8658
# 3: BS:100021     3 71.8369          24.8658
# 4: BS:100021     4 66.8452          24.8658
# 5: BS:100021     5 66.9263          24.8658
# ---                                         
# 116: BS:100091     8  2.4304           0.1233
# 117: BS:100091     9  0.7994           0.1233
# 118: BS:100091    10  0.5884           0.1233
# 119: BS:100091    11  0.2618           0.1233
# 120: BS:100091    12  0.1233           0.1233

Alternatively, stick to your approach:

You are right, the as.character messes around. This should fix it:

library(forecast)

make_new_data <- function(x){
  ts_object <- as.numeric(x[-1])
  forecasted_data <- data.frame(naive(ts_object, 12))[, 1]

  new_data <- data.frame(as.character(x[1]), t(forecasted_data))
  names(new_data) <- c("Customer", paste0("Mon", 1:12))

  return(new_data)
}
z <- data.table::rbindlist(apply(test_data_entry,1,make_new_data))
z
# Customer     Mon1     Mon2     Mon3     Mon4     Mon5     Mon6         Mon7     Mon8     Mon9
# 1: BS:100021  24.8658  24.8658  24.8658  24.8658  24.8658  24.8658  24.8658  24.8658  24.8658
# 2: BS:100022 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656
# 3: BS:100025  86.1946  86.1946  86.1946  86.1946  86.1946  86.1946  86.1946  86.1946  86.1946
# 4: BS:100037  37.6064  37.6064  37.6064  37.6064  37.6064  37.6064  37.6064  37.6064  37.6064
# 5: BS:100050   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
# 6: BS:100056  86.3668  86.3668  86.3668  86.3668  86.3668  86.3668  86.3668  86.3668  86.3668
# 7: BS:100063  24.0067  24.0067  24.0067  24.0067  24.0067  24.0067  24.0067  24.0067  24.0067
# 8: BS:100079   2.2456   2.2456   2.2456   2.2456   2.2456   2.2456   2.2456   2.2456   2.2456
# 9: BS:100089   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
# 10: BS:100091   0.1233   0.1233   0.1233   0.1233   0.1233   0.1233   0.1233   0.1233   0.1233
# Mon10    Mon11    Mon12
# 1:  24.8658  24.8658  24.8658
# 2: -39.8656 -39.8656 -39.8656
# 3:  86.1946  86.1946  86.1946
# 4:  37.6064  37.6064  37.6064
# 5:   0.0000   0.0000   0.0000
# 6:  86.3668  86.3668  86.3668
# 7:  24.0067  24.0067  24.0067
# 8:   2.2456   2.2456   2.2456
# 9:   0.0000   0.0000   0.0000
# 10:   0.1233   0.1233   0.1233

Upvotes: 2

Related Questions