Gabriele Midolo

Reputation: 241

Sum values in different rows sharing the same value in a column

Say I have the following dataset:

PlotName <- c("A", "B", "B", "C", "D", "E", "F", "F", "F")
NewValue <- c(1, 2, 1, 3, 0, 0, 2, 1, 3)
OldValue <- c(3, 3, 1, 2, 1, 3, 0, 3, 1)

I want to sum the NewValue and OldValue values for the elements that repeat in PlotName, while at the same time collapsing the repeated elements (letters) into a single entry. For example, for 'B', NewValue = 2 + 1 = 3 and OldValue = 3 + 1 = 4.

Namely:

PlotName <- c("A", "B", "C", "D", "E", "F")
NewValue <- c(1, 3, 3, 0, 0, 6)
OldValue <- c(3, 4, 2, 1, 3, 4)

I could filter the rows with repeated values in PlotName (e.g. with dplyr) and then sum the values individually, but I am looking for a faster method that works on large datasets with many repeated values.

Upvotes: 2

Views: 1440

Answers (3)

GGamba

Reputation: 13680

With dplyr:

library(dplyr)

data.frame(PlotName, NewValue, OldValue) %>% 
  group_by(PlotName) %>% 
  summarise_all(sum)

# # A tibble: 6 × 3
#   PlotName NewValue OldValue
#     <fctr>    <dbl>    <dbl>
# 1        A        1        3
# 2        B        3        4
# 3        C        3        2
# 4        D        0        1
# 5        E        0        3
# 6        F        6        4
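
In dplyr 1.0.0 and later, summarise_all() is superseded in favour of across(). As a minimal sketch, assuming that version or newer is installed, the same grouped sum can be written as follows (the name result is only for illustration):

```r
library(dplyr)

# Example data from the question
PlotName <- c("A", "B", "B", "C", "D", "E", "F", "F", "F")
NewValue <- c(1, 2, 1, 3, 0, 0, 2, 1, 3)
OldValue <- c(3, 3, 1, 2, 1, 3, 0, 3, 1)

# across(everything(), sum) applies sum() to every non-grouping column
result <- data.frame(PlotName, NewValue, OldValue) %>%
  group_by(PlotName) %>%
  summarise(across(everything(), sum))
```

The grouping column is excluded from everything() automatically, so only NewValue and OldValue are summed.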

Upvotes: 2

akrun

Reputation: 887651

We can do this with any of the group-by operations after creating a data.frame:

aggregate(.~PlotName, data.frame(NewValue, OldValue, PlotName), FUN = sum)

Another option is rowsum:

rowsum(cbind(NewValue, OldValue), PlotName)
#   NewValue OldValue
#A        1        3
#B        3        4
#C        3        2
#D        0        1
#E        0        3
#F        6        4

A faster option is to convert to a data.table and use the data.table methods:

library(data.table)
data.table(NewValue, OldValue, PlotName)[, lapply(.SD, sum), PlotName] 

Upvotes: 2

d.b

Reputation: 32548

sapply(split(OldValue, PlotName), sum)
#A B C D E F 
#3 4 2 1 3 4 
sapply(split(NewValue, PlotName), sum)
#A B C D E F 
#1 3 3 0 0 6 
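
If both sums are wanted in a single table, the two calls can be combined. This is only a sketch in base R using the question's data; the names sums and result are introduced here for illustration:

```r
# Example data from the question
PlotName <- c("A", "B", "B", "C", "D", "E", "F", "F", "F")
NewValue <- c(1, 2, 1, 3, 0, 0, 2, 1, 3)
OldValue <- c(3, 3, 1, 2, 1, 3, 0, 3, 1)

# Apply the same split-and-sum step to each value vector;
# sapply() simplifies the two named vectors into a 6 x 2 matrix
sums <- sapply(
  list(NewValue = NewValue, OldValue = OldValue),
  function(v) sapply(split(v, PlotName), sum)
)

# Move the group labels from the row names into a regular column
result <- data.frame(PlotName = rownames(sums), sums, row.names = NULL)
```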

Upvotes: 1
