nebula026
nebula026

Reputation: 11

R: Summarize Data by Month and Year (Similar to pivot table)

I am trying to summarize data in R by month and year. I am using the ddply function to summarize the data but I want to change its look and doing a normal transpose doesn't give me the desired results. I am loading a csv file with daily river bypass data. The data has the following fields: date, year, month, day and bypass. I use the following code to summarize my file:

summary<- ddply(file,c("Year", "Month"), summarise, Sum =  round(sum(Bypass*1.9835),0)) 
summary

The output looks like:

Year Month   Sum
1946    10  1791
1946    11  1575
1946    12  1129
1947     1   823
1947     2   750
1947     3  1023

(And this goes on for ~61 years of data)

So to my question... Is there a way to transform the data to output in the following way:

                             Month
Year    1   2   3    4   5   6   7   8   9  10      11      12
1946                                        1791    1575    1129
1947    823 750 1023

I only copied in a sample of the data but it goes through 2007.

Thanks in advance

Upvotes: 0

Views: 2079

Answers (1)

mpalanco
mpalanco

Reputation: 13580

library(reshape2)
dcast(df, iYear ~ Month, value.var='Sum')

Output:

  iYear   1   2    3   10   11   12
1  1946  NA  NA   NA 1791 1575 1129
2  1947 823 750 1023   NA   NA   NA

If you want to replace the NAs with zeros:

df1 <- dcast(df, iYear ~ Month, value.var='Sum')
df1[is.na(df1)] <- 0

  iYear   1   2    3   10   11   12
1  1946   0   0    0 1791 1575 1129
2  1947 823 750 1023    0    0    0

Data:

df <- structure(list(iYear = c(1946L, 1946L, 1946L, 1947L, 1947L, 1947L
), Month = c(10L, 11L, 12L, 1L, 2L, 3L), Sum = c(1791L, 1575L, 
1129L, 823L, 750L, 1023L)), .Names = c("iYear", "Month", "Sum"
), class = "data.frame", row.names = c(NA, -6L))

  iYear Month  Sum
1  1946    10 1791
2  1946    11 1575
3  1946    12 1129
4  1947     1  823
5  1947     2  750
6  1947     3 1023

Upvotes: 2

Related Questions