DJ-AFC
DJ-AFC

Reputation: 569

Calculate monthly sums for multiple columns

I have a data frame containing years in the first column, months in the second, days in the third, and then rainfall data for different cities (LON and NYC) in the fourth and fifth columns, as shown below:

mat1 = matrix(c(rep(1979, each=360), rep(1:12, each=30), rep(seq(1, 30, by=1), times=12), 
            rep(seq(5, 25, by=5), times=72), rep(seq(1, 9, by=1), times=40)), nrow=360, ncol=5)

colnames(mat1) = c("Year", "Month", "Day", "LON", "NYC")

I want to calculate monthly sums for each city. The output should take the form:

    LON NYC
Jan  x   x
Feb  x   x
Mar  x   x
................

I have tried:

aggregate(LON ~ Month + Year, mat1, sum)

but this does not output data in the format I want, and crucially it also only allows me to calculate for one city (column) at a time. How can I adapt the above to work for all cities and output in the desired format?

Upvotes: 1

Views: 222

Answers (3)

user2974951
user2974951

Reputation: 10385

Like this?

> aggregate(mat1[,4:5],list(mat1[,2]),sum)
   Group.1 LON NYC
1        1 450 141
2        2 450 150
3        3 450 159
4        4 450 141
5        5 450 150
6        6 450 159
7        7 450 141
8        8 450 150
9        9 450 159
10      10 450 141
11      11 450 150
12      12 450 159

Upvotes: 1

NelsonGon
NelsonGon

Reputation: 13319

Arguably complicating things but a dplyr-tidyr option. This uses the newer pivot_* functions in place of gather/spread. A purely dplyr option shown by @www is probably more efficient.

as.data.frame(mat1) %>% 
   pivot_longer(LON:NYC,names_to = "city")  %>% 
   group_by(Year,Month,city) %>% 
   summarise(Sum = sum(value)) %>% 
   pivot_wider(names_from = city,values_from = Sum)
# A tibble: 12 x 4
# Groups:   Year, Month [12]
    Year Month   LON   NYC
   <dbl> <dbl> <dbl> <dbl>
 1  1979     1   450   141
 2  1979     2   450   150
 3  1979     3   450   159
 4  1979     4   450   141
 5  1979     5   450   150
 6  1979     6   450   159
 7  1979     7   450   141
 8  1979     8   450   150
 9  1979     9   450   159
10  1979    10   450   141
11  1979    11   450   150
12  1979    12   450   159

Upvotes: 2

www
www

Reputation: 39174

A solution using dplyr. We need to convert the matrix to data frame before using the group_by and summarize_at function.

library(dplyr)
mat1_sum <- mat1 %>%
  as.data.frame() %>%
  group_by(Month) %>%
  summarize_at(vars(LON, NYC), sum)
mat1_sum
# # A tibble: 12 x 3
#    Month   LON   NYC
#    <dbl> <dbl> <dbl>
#  1     1   450   141
#  2     2   450   150
#  3     3   450   159
#  4     4   450   141
#  5     5   450   150
#  6     6   450   159
#  7     7   450   141
#  8     8   450   150
#  9     9   450   159
# 10    10   450   141
# 11    11   450   150
# 12    12   450   159

Upvotes: 2

Related Questions