Section_4

Reputation: 41

R aggregating on date then character

I have a table that looks like the following:

Year    Country Variable 1  Variable 2
1970    UK            1       3
1970    USA           1       3
1971    UK            2       5
1971    UK            2       3
1971    UK            1       5
1971    USA           2       2
1972    USA           1       1
1972    USA           2       5

I'd be grateful if someone could tell me how to aggregate the data, grouping first by year and then by country, with the sums of Variable 1 and Variable 2 alongside, so that the output would be:

Year    Country Sum Variable 1  Sum Variable 2
1970    UK              1           3
1970    USA             1           3
1971    UK              5           13
1971    USA             2           2
1972    USA             3           6

This is the code I've tried, to no avail. (The real data frame is 125,000 rows by 30+ columns, hence the subset. Please be kind, I'm new to R!)

#making subset from data
GT2 <- subset(GT1, select = c("iyear", "country_txt", "V1", "V2"))
#making sure data types are correct
GT2[,2]=as.character(GT2[,2])
GT2[,3] <- as.numeric(as.character( GT2[,3] ))
GT2[,4] <- as.numeric(as.character( GT2[,4] ))

#removing NA values
GT2Omit <- na.omit(GT2)

#trying to aggregate - i.e. group by year, then country with the sum of Variable 1 and Variable 2 being shown
aggGT2 <-aggregate(GT2Omit, by=list(GT2Omit$iyear, GT2Omit$country_txt), FUN=sum, na.rm=TRUE)

Upvotes: 1

Views: 276

Answers (2)

Neal Fultz

Reputation: 9687

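Your aggregate call is almost correct. The problem is that you pass the whole data frame, so sum is also applied to the character column country_txt, which fails. Pass only the numeric columns as the data and the grouping columns in by: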

> aggGT2 <-aggregate(GT2Omit[3:4], by=GT2Omit[c("country_txt", "iyear")], FUN=sum, na.rm=TRUE)
> aggGT2
  country_txt iyear V1 V2
1          UK  1970  1  3
2         USA  1970  1  3
3          UK  1971  5 13
4         USA  1971  2  2
5         USA  1972  3  6
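
To check this against the sample table in the question, here is a minimal reproducible sketch. The data frame name df and the result name agg are made up for illustration; the column names follow the question's code.

```r
# Sample data copied from the question; `df` is a hypothetical name
df <- data.frame(
  iyear       = c(1970, 1970, 1971, 1971, 1971, 1971, 1972, 1972),
  country_txt = c("UK", "USA", "UK", "UK", "UK", "USA", "USA", "USA"),
  V1          = c(1, 1, 2, 2, 1, 2, 1, 2),
  V2          = c(3, 3, 5, 3, 5, 2, 1, 5),
  stringsAsFactors = FALSE
)

# Sum only the numeric columns, grouped by year and country
agg <- aggregate(df[c("V1", "V2")],
                 by = df[c("iyear", "country_txt")],
                 FUN = sum, na.rm = TRUE)

# Sort by year, then country, to match the layout asked for
agg <- agg[order(agg$iyear, agg$country_txt), ]
print(agg)
```

Listing iyear first in by puts the year column first in the result; the explicit order() call is only there to make the row order match the desired output.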

Upvotes: 2

user3603486

Reputation:

dplyr is almost always the answer nowadays.

library(dplyr)
# GT2Omit is the cleaned subset from the question (numeric V1/V2, NAs dropped)
aggGT1 <- GT2Omit %>% group_by(iyear, country_txt) %>%
  summarize(sv1 = sum(V1), sv2 = sum(V2))

Having said that, it is good to learn base R functions like aggregate and by.
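
As one base R illustration, aggregate also has a formula interface that reads close to the dplyr version. This is only a sketch on the question's sample data; df and agg are hypothetical names.

```r
# Sample data copied from the question; `df` is a hypothetical name
df <- data.frame(
  iyear       = c(1970, 1970, 1971, 1971, 1971, 1971, 1972, 1972),
  country_txt = c("UK", "USA", "UK", "UK", "UK", "USA", "USA", "USA"),
  V1          = c(1, 1, 2, 2, 1, 2, 1, 2),
  V2          = c(3, 3, 5, 3, 5, 2, 1, 5),
  stringsAsFactors = FALSE
)

# Left of ~ : columns to sum; right of ~ : grouping columns
agg <- aggregate(cbind(V1, V2) ~ iyear + country_txt, data = df, FUN = sum)
print(agg)
```

Note that the formula interface drops rows containing NA by default (na.action = na.omit), so a separate na.omit step is not needed with it.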

Upvotes: 1
