user11815263
user11815263

Reputation: 1

Group data by year and summarise the value

Date    StockPrice
1970-01-01  27.85380952
1970-02-01  22.38105263
1970-03-01  23.1052381
1970-04-01  21.57136364
1970-05-01  18.93285714
1970-06-01  15.44318182
1970-07-01  13.65590909
1970-08-01  13.3352381
1970-09-01  17.38952381
1970-10-01  15.90954545
1970-11-01  13.9705
1970-12-01  14.35454545
1971-01-01  17.3425
1971-02-01  21.85894737
1971-03-01  22.06347826
1971-04-01  22.31904762
1971-05-01  23.5165
1971-06-01  21.23045455
1971-07-01  18.21714286
1971-08-01  16.165
1971-09-01  15.51904762
1971-10-01  16.26190476
1971-11-01  14.8
1971-12-01  18.00363636
1972-01-01  23.1347619
1972-02-01  25.2665
1972-03-01  23.28227273
1972-04-01  23.7355
1972-05-01  21.64454545
1972-06-01  21.475
1972-07-01  22.116
1972-08-01  22.52391304
1972-09-01  22.4165
1972-10-01  22.1
1972-11-01  23.285
1972-12-01  25.42473684
1973-01-01  24.80761905
1973-02-01  22.70210526
1973-03-01  21.43545455
1973-04-01  19.4285
1973-05-01  19.03136364
1973-06-01  17.3552381
1973-07-01  18.49142857
1973-08-01  18.05130435
1973-09-01  18.26789474
1973-10-01  19.20956522
1973-11-01  17.14904762
1973-12-01  13.091

I tried to group the data by year and sum the value but am not able to get the sum instead all the year that is grouped by gives the same total value

sum_Coca<-DT[, list(Tot = sum(Cocacola$StockPrice)),by = year(date)]

transform(sum_Coca, new.col=c(NA,sum_Coca$StockPrice[-1]/sum_Coca$StockPrice[-nrow(sum_Coca)]-1))

i am expecting the result as

Row Labels  Sum of StockPrice
1970    217.9027648
1971    227.2976594
1972    276.40473
1973    229.0205211

but am getting result as

Row Labels  Sum of StockPrice
1970    950.625672
1971    950.625672
1972    950.625672
1973    950.625672

Upvotes: 0

Views: 54

Answers (2)

OTStats
OTStats

Reputation: 1868

A solution using dplyr and lubridate would look like this:

library(dplyr)
library(lubridate)

df %>% 
  group_by(year(Date)) %>% 
  summarise(total_stock = sum(StockPrice))

  ## A tibble: 4 x 2
  # `year(Date)` total_stock
  #         <dbl>       <dbl>
  #1         1970        218.
  #2         1971        227.
  #3         1972        276.
  #4         1973        229.

Or you could clean up labels by creating a new Year field:

df %>% 
  mutate(Year = year(Date)) %>% 
  group_by(Year) %>% 
  summarise(total_stock = sum(StockPrice))

## A tibble: 4 x 2
#   Year total_stock
#  <dbl>       <dbl>
#1  1970        218.
#2  1971        227.
#3  1972        276.
#4  1973        229.

Upvotes: 1

akrun
akrun

Reputation: 887078

The issue is that we are extracting the whole column instead of the values of the column respecting each group. Not clear whether "Cocacola" dataset is the same as 'DT'. If it is, then remove the "Cocacola$"

library(data.table)
DT[, list(Tot = sum(StockPrice)),by = .(Year = year(date))]

Upvotes: 1

Related Questions