Yukun

Reputation: 325

sum up numbers by blocks in R

I want to sum up numbers by blocks, where a block is a run of consecutive rows with the same low/high values.

Here is some sample data:

 data=matrix(c(0,0,0,1,1,0,1,1,1,1,1,0,0,1,0,0,1.2,2.3,1.3,1.5,2.5,2.1,2.3,1.2),
             ncol=3,dimnames=list(c(),c("low","high","time")))

     low high time
 [1,]   0    1  1.2
 [2,]   0    1  2.3
 [3,]   0    1  1.3
 [4,]   1    0  1.5
 [5,]   1    0  2.5
 [6,]   0    1  2.1
 [7,]   1    0  2.3
 [8,]   1    0  1.2

I want to get

       n  sum
 [1,]  3  4.8
 [2,]  2  4
 [3,]  1  2.1
 [4,]  2  3.5

without using any packages. How can I do that in R?

Alternatively, it would also be fine to get:

       n/low n/high sum
 [1,]  0       3    4.8
 [2,]  2       0    4
 [3,]  0       1    2.1
 [4,]  2       0    3.5

Upvotes: 2

Views: 645

Answers (4)

Yukun

Reputation: 325

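I also found a similar option (here df is the sample data as a data frame, e.g. df <- as.data.frame(data)):

 aggregate(df,list(c(0,cumsum(abs(diff(df$low))))),sum)[-1]

I find it more straightforward to understand. Because low and high are 0/1 flags, summing them per block gives the second layout from the question (rows counted per low and high, plus the sum of time).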
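To make the grouping index visible, here is a minimal sketch that splits the one-liner into steps. The names `df`, `block` and `res` are my own choices for illustration; the data is the sample from the question.

```r
# Sample data from the question as a data frame (the name `df` is assumed)
df <- data.frame(low  = c(0, 0, 0, 1, 1, 0, 1, 1),
                 high = c(1, 1, 1, 0, 0, 1, 0, 0),
                 time = c(1.2, 2.3, 1.3, 1.5, 2.5, 2.1, 2.3, 1.2))

# diff() is non-zero wherever `low` changes; cumsum() turns those changes
# into a running block number that starts at 0
block <- c(0, cumsum(abs(diff(df$low))))
block
# 0 0 0 1 1 2 3 3

# Summing every column within each block counts the low rows and the
# high rows (they are 0/1 flags) and totals the time
res <- aggregate(df, list(block = block), sum)[-1]
res
```

Note that the index is built from `low` only, which is enough here because `high` always changes together with `low`.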

Upvotes: 0

sanmath

Reputation: 23

I have solved the problem. I think it is a little bit complicated, but it works!

I generated each column using loops.

1) First I number the blocks, increasing the counter at every change:

 data<-data.frame(data)
 ind1<-vector(mode="numeric", length=0)
 ind1[1]<-1
 # all() collapses the two-column comparison into a single TRUE/FALSE
 for(i in c(2:8)){
   ind1[i]<-if(all(data[i,1:2]==data[i-1,1:2])) ind1[i-1] else ind1[i-1]+1
 }

2) Then I build the sums, also with a loop:

 # ind2 holds one running sum per block; it starts with the first time value
 ind2<-c(1.2,0,0,0)
 k<-1

 for(i in c(2:8)){
   if(all(data[i,1:2]==data[i-1,1:2])){
     ind2[k]<-ind2[k]+data[i,3]
   }else{
     k<-k+1
     ind2[k]<-ind2[k]+data[i,3]
   }
 }

 result<-cbind(data.frame(table(ind1))$Freq,ind2)

The comparison of the two columns returns two values, so it is wrapped in all(); without that, R emits warnings.
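The same two steps (number the blocks, then count and sum per block) can probably also be done without explicit loops. This is only a sketch of an alternative, not the code above; the names `d`, `changed`, `block` and `result` are assumptions made for the example.

```r
# Sample data from the question (the name `d` is assumed)
d <- data.frame(low  = c(0, 0, 0, 1, 1, 0, 1, 1),
                high = c(1, 1, 1, 0, 0, 1, 0, 0),
                time = c(1.2, 2.3, 1.3, 1.5, 2.5, 2.1, 2.3, 1.2))
n <- nrow(d)

# TRUE where a row differs from the previous row in either flag column;
# the first row always starts a new block
changed <- c(TRUE, d$low[-1] != d$low[-n] | d$high[-1] != d$high[-n])

# Running count of block starts = block number of every row
block <- cumsum(changed)

# Rows per block and sum of time per block, side by side
result <- cbind(n   = tabulate(block),
                sum = as.vector(tapply(d$time, block, sum)))
result
```

Comparing whole shifted vectors replaces the row-by-row `if`, so the length-two condition never arises.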

Upvotes: 0

Pierre L

Reputation: 28451

Not sure why the constraint on packages; they can make this much easier. In base R, we can build an index from the runs of consecutive identical combinations of the first two columns, then aggregate using that index as the grouping variable. A final line sets the column names and flattens the result into a regular data frame (df1 is the sample data as a data frame):

ind <- with(rle(do.call(paste, df1[1:2])), rep(1:length(values), lengths))
a <- aggregate(df1$time, list(ind), function(x) c(length(x), sum(x)))[-1]
setNames(do.call(data.frame, a), c("n", "sum"))

  n sum
1 3 4.8
2 2 4.0
3 1 2.1
4 2 3.5
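For readers unfamiliar with `rle`, this sketch prints the intermediate values behind the index. `df1` is assumed to be the question's data as a data frame; `key` and `runs` are names introduced only for this illustration.

```r
# Sample data from the question (assumed to be what df1 holds)
df1 <- data.frame(low  = c(0, 0, 0, 1, 1, 0, 1, 1),
                  high = c(1, 1, 1, 0, 0, 1, 0, 0),
                  time = c(1.2, 2.3, 1.3, 1.5, 2.5, 2.1, 2.3, 1.2))

# One string per row describing its low/high combination
key <- do.call(paste, df1[1:2])
key
# "0 1" "0 1" "0 1" "1 0" "1 0" "0 1" "1 0" "1 0"

# rle() compresses consecutive repeats into (value, length) pairs
runs <- rle(key)
runs$values   # "0 1" "1 0" "0 1" "1 0"
runs$lengths  # 3 2 1 2

# Repeat each run number by its length to label every row with its block
ind <- rep(seq_along(runs$values), runs$lengths)
ind
# 1 1 1 2 2 3 4 4
```

`runs$lengths` is already the `n` column of the result; the index is only needed to sum `time` per block.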

To illustrate how simple it is with help from data.table:

library(data.table)
setDT(df1)[, .(.N, sum(time)), by=rleid(low, high)]

Update

For the follow-up question, see @bgoldst's answer in the comments.

Upvotes: 9

Joachim Isaksson

Reputation: 180977

A similar option, also using aggregate:

aggregate(cbind(n=1,sum=df$time), 
          by=list(c(0, cumsum(abs(diff(df$low))))), 
          FUN=sum)[-1]
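As a rough illustration of why the constant column yields the row count, here is the same call split into steps. `df` is assumed to be the question's data as a data frame; `m`, `block` and `out` are hypothetical names.

```r
# Sample data from the question (the name `df` is assumed)
df <- data.frame(low  = c(0, 0, 0, 1, 1, 0, 1, 1),
                 high = c(1, 1, 1, 0, 0, 1, 0, 0),
                 time = c(1.2, 2.3, 1.3, 1.5, 2.5, 2.1, 2.3, 1.2))

# cbind() recycles the single 1 down every row, so summing the `n` column
# within a block simply counts that block's rows
m <- cbind(n = 1, sum = df$time)
head(m, 3)

# Block number per row, increasing whenever `low` changes
block <- c(0, cumsum(abs(diff(df$low))))

# One sum() per block fills both columns; [-1] drops the grouping column
out <- aggregate(m, by = list(block), FUN = sum)[-1]
out
```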

Upvotes: 3
