Reputation: 11
I am trying to summarize data in R by month and year. I am using the ddply function to summarize the data but I want to change its look and doing a normal transpose doesn't give me the desired results. I am loading a csv file with daily river bypass data. The data has the following fields: date, year, month, day and bypass. I use the following code to summarize my file:
summary<- ddply(file,c("Year", "Month"), summarise, Sum = round(sum(Bypass*1.9835),0))
summary
The output looks like:
Year Month Sum
1946 10 1791
1946 11 1575
1946 12 1129
1947 1 823
1947 2 750
1947 3 1023
(And this goes on for ~61 years of data)
So to my question... Is there a way to transform the data to output in the following way:
Month
Year 1 2 3 4 5 6 7 8 9 10 11 12
1946 1791 1575 1129
1947 823 750 1023
I only copied in a sample of the data but it goes through 2007.
Thanks in advance
Upvotes: 0
Views: 2079
Reputation: 13580
library(reshape2)
dcast(df, iYear ~ Month, value.var='Sum')
Output:
iYear 1 2 3 10 11 12
1 1946 NA NA NA 1791 1575 1129
2 1947 823 750 1023 NA NA NA
If you want to replace the NAs with zeros:
df1 <- dcast(df, iYear ~ Month, value.var='Sum')
df1[is.na(df1)] <- 0
iYear 1 2 3 10 11 12
1 1946 0 0 0 1791 1575 1129
2 1947 823 750 1023 0 0 0
Data:
df <- structure(list(iYear = c(1946L, 1946L, 1946L, 1947L, 1947L, 1947L
), Month = c(10L, 11L, 12L, 1L, 2L, 3L), Sum = c(1791L, 1575L,
1129L, 823L, 750L, 1023L)), .Names = c("iYear", "Month", "Sum"
), class = "data.frame", row.names = c(NA, -6L))
iYear Month Sum
1 1946 10 1791
2 1946 11 1575
3 1946 12 1129
4 1947 1 823
5 1947 2 750
6 1947 3 1023
Upvotes: 2