ghost

Reputation: 309

r - Count the number of ranks in a paired sample

I'm given a paired sample, say:

X = c(14, 5, 2, 8 , 9, 10)

Y = c(7, 3, 4, 13, 11, 12)

If I pool the two samples and sort them into, say, Z, what function can I use to count the number of runs in Z?

Z = c(2, 3, 4, 7, 8, 9, 10, 11, 12, 13)

so Z is now

Z = (X, Y, Y, Y, X, X, X, Y, Y, Y, X)

How do I count the number of X-runs, which in this case is 3, of sizes 1, 2 and 1? I've tried the rle() function, but I don't understand how to get the runs for X and Y separately.

Upvotes: 0

Views: 64

Answers (1)

SymbolixAU

Reputation: 26248

To get the lengths of the runs of each sample in Z, you can use rle(): first find which values of Z are in X (or Y), then keep only the run lengths whose run value is TRUE.

rle(Z %in% X)$lengths[rle(Z %in% X)$values]
#[1] 1 3
rle(Z %in% Y)$lengths[rle(Z %in% Y)$values]
#[1] 3 3

Which, as @docendo discimus points out, can be written more concisely as

with(rle(Z %in% X), lengths[values])
with(rle(Z %in% Y), lengths[values])

Where

Z %in% X ## gives
TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE

Calling rle() on that TRUE/FALSE vector then gives the length of each consecutive run of TRUE or FALSE

rle(Z %in% X)  ## gives
    Run Length Encoding
  lengths: int [1:4] 1 3 3 3
  values : logi [1:4] TRUE FALSE TRUE FALSE

So we can take the lengths and values components separately, and subset the lengths where values == TRUE. The number of runs is then just the length of that subset.
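To get the runs for X and Y in one pass, a minimal sketch is shown here. It assumes no value appears in both X and Y, labels each element of Z by its sample, calls rle() once, and splits the run lengths by label. The names src, runs, run_lengths and n_runs are illustrative only and not part of the approach above.

```r
X <- c(14, 5, 2, 8, 9, 10)
Y <- c(7, 3, 4, 13, 11, 12)
Z <- c(2, 3, 4, 7, 8, 9, 10, 11, 12, 13)

# Label each value of Z by the sample it came from
# (assumption: X and Y share no values)
src <- ifelse(Z %in% X, "X", "Y")

# One rle() call gives the runs of consecutive labels
runs <- rle(src)

# Group the run lengths by label: X -> 1 3, Y -> 3 3
run_lengths <- split(runs$lengths, runs$values)

# Number of runs per label: X = 2, Y = 2
n_runs <- lengths(run_lengths)

print(run_lengths)
print(n_runs)
```

The split() step is a design choice: it keeps a single rle() result and returns a named list, so run_lengths$X and run_lengths$Y can be used directly.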


Data

X <- c(14, 5, 2, 8, 9, 10)
Y <- c(7, 3, 4, 13, 11, 12)
Z <- c(2, 3, 4, 7, 8, 9, 10, 11, 12, 13)
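Note that Z as posted has 10 values, while X and Y together have 12 (5 and 14 are missing). If Z is meant to be the full sorted pool of both samples, one possible way to derive the labels directly from X and Y is sketched here. It uses order() instead of %in%, so it does not depend on matching values. The names pooled, src, runs and run_lengths are hypothetical.

```r
X <- c(14, 5, 2, 8, 9, 10)
Y <- c(7, 3, 4, 13, 11, 12)

# Pool both samples and remember which sample each value came from
pooled <- c(X, Y)
src <- rep(c("X", "Y"), times = c(length(X), length(Y)))

# Reorder the labels by the sorted pooled values
src_sorted <- src[order(pooled)]
# X Y Y X Y X X X Y Y Y X

# Runs of consecutive labels, grouped by sample
runs <- rle(src_sorted)
run_lengths <- split(runs$lengths, runs$values)
# X -> 1 1 3 1, Y -> 2 1 3

print(run_lengths)
```

With the full pool there are four X-runs and three Y-runs, so the result differs from the 10-value Z used above.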

Upvotes: 3
