drv

Reputation: 63

Grouping and splitting a data frame in R

The following promotion sales table lists each product, the group(s) in which its promotion ran, and the promotion's start and end dates.

   Product.code  cgrp promo.from   promo.to
1    1100001369    12 2014-01-01 2014-03-01
2    1100001369 16 37 2014-01-01 2014-03-01
3    1100001448    12 2014-03-01 2014-03-01
4    1100001446    12 2014-03-01 2014-03-01
5    1100001629 11 30 2014-03-01 2014-03-01
6    1100001369 16 37 2014-03-01 2014-06-01
7    1100001368    12 2014-06-01 2014-07-01
8    1100001369    12 2014-06-01 2014-07-01
9    1100001368 11 30 2014-06-01 2014-07-01
10   1100001738 11 30 2014-06-01 2014-07-01
11   1100001629 11 30 2014-06-01 2014-06-01
12   1100001738 11 30 2014-07-01 2014-07-01
13   1100001619 11 30 2014-08-01 2014-08-01
14   1100001619 11 30 2014-08-01 2014-08-01
15   1100001629 11 30 2014-08-01 2014-08-01
16   1100001738    12 2014-09-01 2014-09-01
17   1100001738 16 37 2014-08-01 2014-08-01
18   1100001448    12 2014-09-01 2014-09-01
19   1100001446    12 2014-10-01 2014-10-01
20   1100001369    12 2014-11-01 2014-11-01
21   1100001547 16 37 2014-11-01 2014-11-01
22   1100001368 11 30 2014-11-01 2014-11-01

I am trying to group the rows by Product.code and cgrp so that I can see all promotions for a product in a particular group and do further analysis.

I tried looping through the whole data.frame, but that was inefficient and buggy.

What is an efficient way to do this?

[edit] I want to get multiple data.frames like the following:

x=

   Product.code  cgrp promo.from   promo.to
3    1100001448    12 2014-03-01 2014-03-01
18   1100001448    12 2014-09-01 2014-09-01

y=

   Product.code  cgrp promo.from   promo.to
1    1100001369    12 2014-01-01 2014-03-01
8    1100001369    12 2014-06-01 2014-07-01
20   1100001369    12 2014-11-01 2014-11-01

Upvotes: 0

Views: 65

Answers (1)

akrun

Reputation: 886938

You could split the 'cgrp' column and reshape the dataset to 'long' format with cSplit. Then, split the dataset ('df1') by 'Product.code' and 'cgrp' to create a list ('lst').

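 library(splitstackshape)
 # split the space-separated 'cgrp' values into one row per group ('long' format)
 df1 <- as.data.frame(cSplit(df, 'cgrp', ' ', 'long'))
 # one data.frame per Product.code/cgrp combination; drop=TRUE omits empty combinations
 lst <- split(df1, list(df1$Product.code, df1$cgrp), drop=TRUE)
 # rename the list elements dfN1, dfN2, ...
 names(lst) <- paste0('dfN', seq_along(lst))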
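
If splitstackshape is not available, the same two steps can be sketched in base R. This is only an illustration, not the method used above: it swaps cSplit for strsplit plus rep, uses a five-row subset of the question's table, and assumes both 'Product.code' and 'cgrp' are stored as character.

```r
# Five rows taken from the question's table.
# Assumption: 'Product.code' and 'cgrp' are character columns.
df <- data.frame(
  Product.code = c("1100001369", "1100001369", "1100001448",
                   "1100001369", "1100001448"),
  cgrp         = c("12", "16 37", "12", "12", "12"),
  promo.from   = as.Date(c("2014-01-01", "2014-01-01", "2014-03-01",
                           "2014-06-01", "2014-09-01")),
  promo.to     = as.Date(c("2014-03-01", "2014-03-01", "2014-03-01",
                           "2014-07-01", "2014-09-01")),
  stringsAsFactors = FALSE
)

# Split each 'cgrp' entry on spaces, then repeat every row once per group code
parts <- strsplit(df$cgrp, " ", fixed = TRUE)
df1 <- df[rep(seq_len(nrow(df)), lengths(parts)), ]
df1$cgrp <- unlist(parts)

# One data.frame per (Product.code, cgrp) combination that actually occurs
lst <- split(df1, list(df1$Product.code, df1$cgrp), drop = TRUE)

# Element names have the form "<Product.code>.<cgrp>"
names(lst)
```

Here the row with cgrp "16 37" becomes two rows, so the product ends up in three groups (12, 16 and 37), each as its own list element.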

It may be better to keep the datasets in a list. But if you want them as separate objects in the global environment, one option is list2env (not recommended).

 list2env(lst, envir=.GlobalEnv)
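
As a rough sketch of why keeping the list can be more convenient, the further analysis can be applied to every group at once with sapply. The 'lst' below is a hypothetical stand-in built from the x and y examples in the question, and the two summaries (promotion count and total promotion days) are chosen only for illustration.

```r
# Hypothetical stand-in for 'lst', using the rows of x and y from the question
lst <- list(
  dfN1 = data.frame(
    Product.code = "1100001448", cgrp = "12",
    promo.from = as.Date(c("2014-03-01", "2014-09-01")),
    promo.to   = as.Date(c("2014-03-01", "2014-09-01"))
  ),
  dfN2 = data.frame(
    Product.code = "1100001369", cgrp = "12",
    promo.from = as.Date(c("2014-01-01", "2014-06-01", "2014-11-01")),
    promo.to   = as.Date(c("2014-03-01", "2014-07-01", "2014-11-01"))
  )
)

# Number of promotions in each group
n_promos <- sapply(lst, nrow)

# Total days between promo.from and promo.to, summed within each group
days <- sapply(lst, function(d) sum(as.numeric(d$promo.to - d$promo.from)))

# A single group is still available by name when needed
lst[["dfN1"]]
```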

Upvotes: 1
