geekzeus
geekzeus

Reputation: 895

Display Factor Variable Data into Table

This is the small sampleData of half million rows,here product,date and salesamount is a factor variable. I need to divide the data by year 2011 and 2012 and display the product with its total sum of the sales amount in that year. I tried a lot at last i did it in excel using pivot table. but i want know that how this can be achived in R, as i don't want more excel files. Thankyou

 product    date       salesamount
   a       2011-01          35
   b       2011-02          40
   c       2011-03          50
   d       2011-01          55
   b       2012-02          46
   a       2011-02          35
   d       2012-01          62
   c       2012-03          56
   c       2012-02          56
   a       2012-03          38
   b       2012-01          46
   e       2011-03          60
   a       2012-03          38
   e       2012-02          67
   d       2011-01          55

Data Should Look something like this after dividing it by year

 product   year2011    year2012
   a         70         76
   b         40         92
   c         50         112
   d         110        62
   e         60         67

Upvotes: 1

Views: 67

Answers (3)

IRTFM
IRTFM

Reputation: 263391

Assuming the dat object is dat then the base tapply approach would be:

with( dat, tapply( salesamount, list( product, substring(date, 1,4) ), sum) )
  2011 2012
a   70   76
b   40   92
c   50  112
d  110   62
e   60   67

Upvotes: 0

jogo
jogo

Reputation: 12569

Here is a solution with data.table:

library("data.table")
DT[, year:=substr(date, 1, 4)]
dcast(DT, product ~ year, value.var="salesamount", fun.aggregate = sum)
# > dcast(DT, product ~ year, value.var="salesamount", fun.aggregate = sum)
#    product 2011 2012
# 1:       a   70   76
# 2:       b   40   92
# 3:       c   50  112
# 4:       d  110   62
# 5:       e   60   67

Data:

DT <- fread(
" product    date       salesamount
   a       2011-01          35
   b       2011-02          40
   c       2011-03          50
   d       2011-01          55
   b       2012-02          46
   a       2011-02          35
   d       2012-01          62
   c       2012-03          56
   c       2012-02          56
   a       2012-03          38
   b       2012-01          46
   e       2011-03          60
   a       2012-03          38
   e       2012-02          67
   d       2011-01          55")

Upvotes: 1

Axeman
Axeman

Reputation: 35352

  1. Create a year variable.
  2. Group and sum.
  3. Reshape long to wide.

.

library(dplyr)
library(tidyr)

df %>% 
  separate(date, c('year', 'month')) %>% 
  group_by(product, year) %>% 
  summarise(salesamount = sum(salesamount)) %>% 
  spread(year, salesamount, sep = '')
# A tibble: 5 x 3
# Groups:   product [5]
  product year2011 year2012
  <chr>      <int>    <int>
1 a             70       76
2 b             40       92
3 c             50      112
4 d            110       62
5 e             60       67

Upvotes: 2

Related Questions