Ger
Ger

Reputation: 259

Summing only the available columns in a data frame

I have the following two data frames:

a<-1,2,3,4,5
b<-3,4,5,6,7
d<-"a","b","c","d","e"
df1<-data.frame(a,b,d)

a<-1,2,3,4,5
c<-3,4,5,6,7
e<-"a","b","c","d","e"
df2<-data.frame(a,c,e)

I want to get in both data frames a column called total which sums a+b+c. Since column c is missing in df1, and column b is missing in df2, I get the error "replacement has 0 rows, data has 5123" (uninitialized column)

But is there a way in which you can sum these columns by just skipping the missing columns? This means that I want to get the following:

df1$total<-sum(df1$a+df1$b)
df2$total<-sum(df2$a+df2$c)

Upvotes: 0

Views: 46

Answers (2)

s_baldur
s_baldur

Reputation: 33603

Given a slightly more general context (there are numeric columns in the data.frame that you don't want to include in the sum) you could do something like the following:

col2sum <- c("a", "b", "c")
df1$total <- rowSums(Filter(is.numeric, df1[names(df1) %in% col2sum]))
df1
  a b c f total
1 1 3 a 2     4
2 2 4 b 3     6
3 3 5 c 4     8
4 4 6 d 5    10
5 5 7 e 6    12
df2$total <- rowSums(Filter(is.numeric, df2[names(df2) %in% col2sum]))
df2
  a c d e total
1 1 4 a 1     5
2 2 5 b 2     7
3 3 6 c 3     9
4 4 7 d 4    11
5 5 8 e 5    13

Alternatively:

rowSums(df2[names(df2) %in% col2sum & sapply(df2, is.numeric)])

** Data **

df1 <- data.frame(a = 1:5, b = 3:7, c = letters[1:5], f = 2:6)
df2 <- data.frame(a = 1:5, c = 4:8, d = letters[1:5], e = 1:5)

Upvotes: 1

jay.sf
jay.sf

Reputation: 73612

Select with sapply(), then rowSums()

df1$total <- rowSums(df1[, sapply(df1, is.numeric)])

Yielding

> df1
  a b c total
1 1 3 a     4
2 2 4 b     6
3 3 5 c     8
4 4 6 d    10
5 5 7 e    12

Data

df1 <- data.frame(a=1:5, b=3:7, c=letters[1:5])

Upvotes: 1

Related Questions