Reputation: 75
I have an update on my previous question:
c(123, 4525, 4365, 234, 674, NA, NA, NA, NA, NA, NA, NA,
24, 347, 457, 3246, 234, 5, 346, NA, NA, NA, NA, NA, NA) # [... and so on]
Is there any way to get the sum of each pack of values separated by NAs? Both the packs of values and the runs of NAs vary in length across the vector, and that's where I see the problem…
Ronak Shah's answer was very helpful but there remains a problem:
Some of my packs of values sum to 0, and that is important information for me.
So if I use new[new != 0], I cut those packs off and end up with no information about which sum belongs to which pack of values.
Upvotes: 0
Views: 61
Reputation: 887511
We can do this with aggregate and rleid
library(data.table)
i1 <- is.na(x)
aggregate(cbind(val = x[!i1])~ cbind(grp = rleid(i1)[!i1]), FUN = sum)
# grp val
#1 1 9921
#2 3 4659
#3 5 5289
#4 7 0
#5 9 0
Upvotes: 0
Reputation: 8846
This might be a bit convoluted. The logic is sound, but there's quite possibly a way to simplify it a bit.
c(123, 4525, 4365, 234, 674, NA, NA, NA, NA, NA, NA, NA, 24, 347, 457, 3246,
234, 5, 346, NA, NA, NA, NA, NA, NA, 45, 778, 986, 3345, 135, NA, NA, NA, NA,
0, 0, NA, NA, 99, -2, -97, NA, NA) -> x
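isna <- !is.na(x)              # note: TRUE marks non-NA values, despite the name
ix <- c(0, diff(isna)) + isna  # 2 where values resume after NAs, -1 where an NA run starts
ix[ix == 1] <- 0               # keep only the run-start markers
ix <- cumsum(ix) + 1           # running counter: the run number at non-NA positions
ix <- ix * as.integer(isna)    # zero out the NA positions
sapply(split(x, ix)[-1], sum)  # drop group 0 (the NAs) and sum each remaining run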
# 1 2 3 4 5
# 9921 4659 5289 0 0
What happens is that a few logical and arithmetic operations build an index in which each run of non-NA values gets its own number and every NA gets 0. The vector is then split along this index, the NA group is dropped, and each remaining element is summed.
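To make the index construction easier to follow, here is a minimal trace on a short vector. The vector y and the intermediate names are made up for illustration; the steps are the same as in the code above.

```r
# Illustrative input (not from the question): three runs of values, one summing to 0
y <- c(1, 2, NA, NA, 0, 0, NA, 3)

notna <- !is.na(y)                 # TRUE TRUE FALSE FALSE TRUE TRUE FALSE TRUE
step1 <- c(0, diff(notna)) + notna # 1 1 -1 0 2 1 -1 2
step1[step1 == 1] <- 0             # 0 0 -1 0 2 0 -1 2  (only run boundaries remain)
step2 <- cumsum(step1) + 1         # 1 1 0 0 2 2 1 3
ix <- step2 * as.integer(notna)    # 1 1 0 0 2 2 0 3    (NA positions forced to 0)

# Group 0 holds the NAs, so it is dropped before summing
sums <- sapply(split(y, ix)[-1], sum)
sums
# 1 2 3
# 3 0 3
```

The zero-sum run keeps its own label (2 here), so it stays distinguishable from the other runs. Note that dropping the first group with [-1] assumes the vector contains at least one NA.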
Taking inspiration from Moody, here's an rle()-based solution
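notnaruns <- function(x) {
  notna <- !is.na(x)
  # run lengths of the non-NA indicator (the local notna, not the global isna)
  notnarl <- rle(notna)$lengths
  repruns <- rep(seq_along(notnarl), notnarl) + 1
  # NAs become 0; non-NA runs become 1, 2, 3, ... when x starts with a value
  repruns * notna * 0.5
}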
tapply(x, notnaruns(x), sum)[-1]
# 1 2 3 4 5
# 9921 4659 5289 0 0
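As far as I can tell, the numbering trick above relies on the vector starting with a non-NA value (with a leading NA the labels come out as 1.5, 2.5, ...). A hedged base-R sketch that does not depend on that is below; sum_runs is a hypothetical helper name of my own, and it assumes the input has at least one non-NA value.

```r
# Hypothetical helper (not from the answers above): sum each run of non-NA values
sum_runs <- function(x) {
  notna <- !is.na(x)
  r <- rle(notna)
  # One id per run, NA runs included, repeated to the length of x
  run_id <- rep(seq_along(r$lengths), r$lengths)
  # Sum only the non-NA positions, grouped by the run they belong to
  sums <- tapply(x[notna], run_id[notna], sum)
  # Relabel the runs 1, 2, 3, ... in order of appearance
  setNames(as.vector(sums), seq_along(sums))
}

# Illustrative input with a leading NA and two zero-sum packs
y <- c(NA, 123, 77, NA, NA, 0, 0, NA, 99, -2, -97)
res <- sum_runs(y)
res
#   1   2   3
# 200   0   0
```

Because the NAs are filtered out before grouping, there is no placeholder group to drop afterwards, and the zero-sum packs are kept.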
Upvotes: 0
Reputation: 47330
You could use data.table::rleid:
library(data.table)
tapply(x[!is.na(x)], rleid(is.na(x))[!is.na(x)], sum)
# 1 3 5 7 9
# 9921 4659 5289 0 0
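The labels 1, 3, 5, 7, 9 count the NA runs as well. If the goal is to know which sum belongs to which pack, one option is to record where each pack starts and ends. This is only a sketch in base R using rle(); the vector y, the data frame packs and its column names are my own choices for illustration.

```r
# Illustrative input (not from the question)
y <- c(5, 5, NA, 0, 0, NA, NA, 7)

r <- rle(!is.na(y))
ends <- cumsum(r$lengths)        # last position of every run
starts <- ends - r$lengths + 1   # first position of every run
keep <- r$values                 # TRUE for runs of non-NA values

packs <- data.frame(start = starts[keep], end = ends[keep])
# Sum the values between each start and end position
packs$sum <- mapply(function(s, e) sum(y[s:e]), packs$start, packs$end)
packs
#   start end sum
# 1     1   2  10
# 2     4   5   0
# 3     8   8   7
```

Each row then identifies a pack by its position in the original vector, so a pack that sums to 0 is still traceable.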
Upvotes: 1