brunktier

Reputation: 75

Get multiple/grouped sums for a vector with values separated by NA's

I have an update on my previous question:

c(123, 4525, 4365, 234, 674, NA, NA, NA, NA, NA, NA, NA, 
  24, 347, 457, 3246, 234, 5, 346, NA, NA, NA, NA, NA, NA) # [... and so on]

Is there any way to get the sum of each pack of values separated by NA's? Both the runs of values and the runs of NA's vary in length across the vector, and that's where I see the problem…

Ronak Shah's answer was very helpful, but one problem remains: some of my packs of values sum to 0, and that is important information for me!
So if I use new[new != 0] I cut those sums off, and in the end I can no longer tell which sum belongs to which pack of values.

Upvotes: 0

Views: 61

Answers (3)

akrun

Reputation: 887511

We can do this with aggregate() and data.table's rleid(), where x is the vector of values:

library(data.table)
i1 <- is.na(x)
aggregate(cbind(val = x[!i1])~ cbind(grp = rleid(i1)[!i1]), FUN = sum)
#  grp  val
#1   1 9921
#2   3 4659
#3   5 5289
#4   7    0
#5   9    0

Upvotes: 0

AkselA

Reputation: 8846

This might be a bit convoluted. The logic is sound, but there's quite possibly a way to simplify it a bit.

x <- c(123, 4525, 4365, 234, 674, NA, NA, NA, NA, NA, NA, NA, 24, 347, 457, 3246,
       234, 5, 346, NA, NA, NA, NA, NA, NA, 45, 778, 986, 3345, 135, NA, NA, NA, NA,
       0, 0, NA, NA, 99, -2, -97, NA, NA)

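isna <- !is.na(x)               # despite the name: TRUE where a value is present
ix <- c(0, diff(isna)) + isna   # 2 where a later run of values starts, -1 where an NA gap starts
ix[ix == 1] <- 0                # zero out positions inside a run, keeping only the boundaries
ix <- cumsum(ix) + 1            # running count: each run of values gets its own number

ix <- ix * as.integer(isna)     # set the index to 0 on the NA positions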

sapply(split(x, ix)[-1], sum)
#    1    2    3    4    5 
# 9921 4659 5289    0    0 

Through a series of logical and arithmetic operations, I build an index that assigns a unique number to each run of non-NA values (and 0 to every NA). The vector is then split along this index, the 0 group is dropped, and each remaining element is summed.
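To make the index construction easier to follow, here is a minimal trace of the same steps on a short vector. The vector y and the names notna, step and idx are made up for this illustration; they are not part of the answer above.

```r
# Short, hypothetical vector: three runs of values, one of which sums to 0
y <- c(1, 2, NA, NA, 0, 0, NA, 5)

notna <- !is.na(y)                   # TRUE where a value is present
step  <- c(0, diff(notna)) + notna   # 2 at the start of a later run, -1 at the start of an NA gap
step[step == 1] <- 0                 # drop the positions inside a run
idx   <- (cumsum(step) + 1) * as.integer(notna)

idx
# 1 1 0 0 2 2 0 3

sums <- sapply(split(y, idx)[-1], sum)   # [-1] drops the "0" group holding the NA's
sums
# 1 2 3
# 3 0 5
```

The run that sums to 0 keeps its own slot (group 2), which is the behaviour the question asks for.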


Taking inspiration from Moody, here's an rle()-based solution

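notnaruns <- function(x) {
    notna <- !is.na(x)                                  # TRUE where a value is present
    notnarl <- rle(notna)$lengths                       # lengths of alternating value/NA runs
    repruns <- rep(seq_along(notnarl), notnarl) + 1     # run number per element, shifted by one
    repruns * notna * 0.5                               # 0 for NA's; 1, 2, 3, ... for value runs when x starts with a value
}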

tapply(x, notnaruns(x), sum)[-1]
#    1    2    3    4    5 
# 9921 4659 5289    0    0 

Upvotes: 0

moodymudskipper

Reputation: 47330

You could use data.table::rleid():

library(data.table)
tapply(x[!is.na(x)], rleid(is.na(x))[!is.na(x)], sum)
#    1    3    5    7    9 
# 9921 4659 5289    0    0 

Upvotes: 1
