Reputation: 173
I have the following dataset:
ireland england france year
5 3 2 1920
4 3 4 1921
6 2 1 1922
3 1 5 1930
2 5 2 1931
I need to summarise the data by decade: the 1920s and the 1930s. That is, I need the total points for ireland, england and france over 1920-1922, and another set of totals for ireland, england and france over 1930-1931.
Any ideas? I have tried but failed.
x <- read.table(text = "ireland england france year
5 3 2 1920
4 3 4 1921
6 2 1 1922
3 1 5 1930
2 5 2 1931", header = T)
Upvotes: 5
Views: 4989
Reputation: 717
As A5C1D2H2I1M1N2O1R2T1 mentioned, you can use findInterval()
to assign the corresponding decade to each year, and then aggregate()
to group by decade:
txt <-
"ireland england france year
5 3 2 1920
4 3 4 1921
6 2 1 1922
3 1 5 1930
2 5 2 1931"
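df <- read.table(text = txt, header = TRUE)
# Lower bounds of the decade bins
decades <- c(1920, 1930, 1940)
# Label each row with the start year of its decade
df$decade <- decades[findInterval(df$year, decades)]
# Sum the three country columns within each decade
aggregate(cbind(ireland, england, france) ~ decade, data = df, sum)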
Output:
decade ireland england france
1 1920 15 8 7
2 1930 5 6 7
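To see what the findInterval() step is doing, here is a minimal sketch that runs it on the year values alone. The names `years` and `idx` are only illustrative and are not part of the answer above.

```r
# Lower bounds of the bins, as in the answer above
decades <- c(1920, 1930, 1940)
# The year column from the question, typed in by hand for illustration
years <- c(1920, 1921, 1922, 1930, 1931)

# findInterval() returns, for each year, the index i such that
# decades[i] <= year < decades[i + 1]
idx <- findInterval(years, decades)
print(idx)

# Indexing back into decades turns the bin index into a decade label
print(decades[idx])
```

Note that a year below the first break (for example 1919) gets index 0, so the vector of breaks should start at or below the earliest year in the data.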
Upvotes: 0
Reputation: 10360
How about dividing the years by 10 and then summarizing?
library(dplyr)
x %>%
  mutate(decade = floor(year / 10) * 10) %>%
  group_by(decade) %>%
  summarize_all(sum) %>%
  select(-year)
# A tibble: 2 x 4
# decade ireland england france
# <dbl> <int> <int> <int>
# 1 1920 15 8 7
# 2 1930 5 6 7
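If you would rather not load dplyr, the same divide-by-10 idea can be sketched in base R. This is only one possible variant, assuming `x` has the four columns shown in the question; the name `totals` is illustrative.

```r
# Rebuild the data from the question, including the year column
x <- read.table(text = "ireland england france year
5 3 2 1920
4 3 4 1921
6 2 1 1922
3 1 5 1930
2 5 2 1931", header = TRUE)

# Integer division by 10 drops the last digit; multiplying back
# gives the first year of the decade (1921 -> 192 -> 1920)
x$decade <- (x$year %/% 10) * 10

# Sum each country's points within each decade
totals <- aggregate(cbind(ireland, england, france) ~ decade,
                    data = x, FUN = sum)
print(totals)
```

Unlike a fixed vector of breaks, this needs no changes when years from other decades are added to the data.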
Upvotes: 7