user3446735

Reputation: 125

Calculate row sum value in R

Hi, I am new to R and would like some advice on how to calculate sums within a data frame.

       year value
Row 1  2001  10
Row 2  2001  20
Row 3  2002  15
Row 4  2002  NA
Row 5  2003  5

How can I use R to return the sum of value for each year? Many thanks!

       year  sum value
Row 1  2001  30
Row 2  2002  15
Row 3  2003  5

Upvotes: 0


Answers (3)

Rich Scriven

Reputation: 99371

There is also rowsum, which is quite efficient:

with(mydf, rowsum(value, year, na.rm=TRUE))
#      [,1]
# 2001   30
# 2002   15
# 2003    5

Or tapply

with(mydf, tapply(value, year, sum, na.rm=TRUE))
# 2001 2002 2003 
#   30   15    5 

Or as.data.frame(xtabs(...))

as.data.frame(xtabs(mydf[2:1]))
#   year Freq
# 1 2001   30
# 2 2002   15
# 3 2003    5
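The snippets above assume that mydf already exists. As a minimal, self-contained sketch (the names sums, result and sum_value are my own, not from the answer), this is one way to build the sample data from the question and turn the tapply result back into a data frame shaped like the requested output:

```r
# Sample data from the question
mydf <- data.frame(
  year  = c(2001, 2001, 2002, 2002, 2003),
  value = c(10, 20, 15, NA, 5)
)

# tapply returns a named 1-d array: names are the years, values are the sums
sums <- with(mydf, tapply(value, year, sum, na.rm = TRUE))

# Rebuild a data frame in the layout the question asks for
result <- data.frame(
  year      = as.numeric(names(sums)),  # names are character, so convert back
  sum_value = as.vector(sums),          # drop the array attributes
  row.names = NULL
)
print(result)
```

The same conversion should work for the rowsum result if you take the years from rownames() instead of names(), since rowsum returns a matrix.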

Upvotes: 2

Mark

Reputation: 4537

LyzandeR has provided a working answer in base R. If you want to use dplyr, which is a great data management tool, you could do:

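library(dplyr)  # provides %>%, group_by() and summarise()

year <- c(2001, 2001, 2002, 2002, 2003)
value <- c(10, 20, 15, NA, 5)
mydf <- data.frame(year, value)

# Sum value within each year, ignoring NA
mydf %>%
  group_by(year) %>%
  summarise(sum_values = sum(value, na.rm = TRUE))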

The advantage of dplyr in this case is that on larger datasets it can be considerably faster than base R's aggregate. I also find it much more readable.

Upvotes: 1

LyzandeR

Reputation: 37889

There are lots of ways to do that. One of them is using the function aggregate like this:

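year <- c(2001, 2001, 2002, 2002, 2003)
value <- c(10, 20, 15, NA, 5)
mydf <- data.frame(year, value)

# Group by the data frame's own year column; na.rm = TRUE is passed on to sum()
mytable <- aggregate(mydf$value, by = list(mydf$year), FUN = sum, na.rm = TRUE)
colnames(mytable) <- c('Year', 'sum_values')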

> mytable
  Year sum_values
1 2001         30
2 2002         15
3 2003          5
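aggregate also has a formula interface, which may be a little shorter here. The following is a sketch rather than part of the original answer (by_year is a name I picked); it relies on the formula method's default na.action = na.omit, which drops the row with the NA before summing, so na.rm is not needed:

```r
# Sample data from the question
mydf <- data.frame(
  year  = c(2001, 2001, 2002, 2002, 2003),
  value = c(10, 20, 15, NA, 5)
)

# Formula interface: "value by year"; rows containing NA are omitted by default
by_year <- aggregate(value ~ year, data = mydf, FUN = sum)
print(by_year)
```

One difference to be aware of: because rows with NA are removed up front, a year whose values are all NA would not appear in this result at all, whereas the by = version with na.rm = TRUE would report it with a sum of 0.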


Upvotes: 2
