user14316891

Reputation: 29

Find replicates/duplicates in a vector in R

This is a very simple one. I have a vector containing replicates of two different values. I want to calculate the sum of replicates of each value. An example of my vector:

> m <- c(rep(420,20), 421,rep(420,5),421,420,420,421,421,rep(420,3))
> m
 [1] 420 420 420 420 420 420 420 420 420 420 420 420 420 420 420 420 420 420 420 420 421 420 420 420 420 420 421
[28] 420 420 421 421 420 420 420

My vector contains lots of consecutive values of 420. I used the function rle():

> rle(m)
Run Length Encoding
  lengths: int [1:7] 20 1 5 1 2 2 3
  values : num [1:7] 420 421 420 421 420 421 420

This gives me the run lengths, but it lists each run separately. How can I calculate how many consecutive 420s there are in my vector in total?

Let's say I have another vector:

> n <- c(1,1,2,3,1,2,3,1,4,5,1,1,1,6,5,6) 
> n
 [1] 1 1 2 3 1 2 3 1 4 5 1 1 1 6 5 6
> with(rle(n), tapply(lengths, values, FUN = sum))
 1 2 3 4 5 6  
 7 2 2 1 2 2 

Here it says there are seven 1s, but only five of them occur consecutively (in runs of length greater than 1). How can I calculate that?

Upvotes: 1

Views: 140

Answers (1)

akrun

Reputation: 886948

From the rle output, we can drop the runs of length 1 and then do a group-by sum of the remaining lengths with tapply:

# keep only the runs longer than 1 element
lst1 <- within.list(rle(m), {
    i1 <- lengths > 1
    lengths <- lengths[i1]
    values <- values[i1]
})

with(lst1, tapply(lengths, values, FUN = sum))
# 420 421 
#  30   2 

For the vector n

# same filter applied to n
lst1 <- within.list(rle(n), {
    i1 <- lengths > 1
    lengths <- lengths[i1]
    values <- values[i1]
})

with(lst1, tapply(lengths, values, FUN = sum))
#1 
#5 
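
The two steps above (filter the runs, then sum by value) can also be wrapped in a small function so they are not repeated for every vector. This is only a sketch: the name sum_runs and the min_len argument are made up for illustration, and it assumes a "consecutive" value means a run of at least two equal elements.

```r
# Example vectors from the question
m <- c(rep(420, 20), 421, rep(420, 5), 421, 420, 420, 421, 421, rep(420, 3))
n <- c(1, 1, 2, 3, 1, 2, 3, 1, 4, 5, 1, 1, 1, 6, 5, 6)

# Hypothetical helper: per value, sum the lengths of all runs
# that are at least `min_len` elements long.
sum_runs <- function(x, min_len = 2) {
  r <- rle(x)
  keep <- r$lengths >= min_len   # drop isolated values (runs shorter than min_len)
  tapply(r$lengths[keep], r$values[keep], FUN = sum)
}

res_m <- sum_runs(m)   # 420 -> 30, 421 -> 2
res_n <- sum_runs(n)   # 1 -> 5
```

Indexing r$lengths and r$values directly avoids within.list, and changing min_len lets you decide how long a run must be before it counts.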

Upvotes: 1
