Reputation: 309
I'm given a paired sample, say:
X = c(14, 5, 2, 8 , 9, 10)
Y = c(7, 3, 4, 13, 11, 12)
If I pool and sort the two samples into, say, Z, what function can I use to count the runs in Z?
Z = c(2, 3, 4, 7, 8, 9, 10, 11, 12, 13)
so, labelling each element by the sample it came from, Z is now
Z = (X, Y, Y, Y, X, X, X, Y, Y, Y)
How do I count the number of X-runs, which in this case is 2, of sizes 1 and 3? I've tried the rle() function, but I don't understand how to get the runs for X and Y separately.
Upvotes: 0
Views: 64
Reputation: 26248
To get the lengths of the runs of each sample in Z, you can use rle(): first find which values of Z are in X (or Y), then subset the run lengths on the runs whose value is TRUE
rle(Z %in% X)$lengths[rle(Z %in% X)$values]
#[1] 1 3
rle(Z %in% Y)$lengths[rle(Z %in% Y)$values]
#[1] 3 3
Which, as @docendo discimus points out, can be written as
with(rle(Z %in% X), lengths[values])
with(rle(Z %in% Y), lengths[values])
Where
Z %in% X ## gives
TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE FALSE FALSE
So using rle on the TRUE/FALSE vector gives us the runs of TRUE and FALSE
rle(Z %in% X) ## gives
Run Length Encoding
lengths: int [1:4] 1 3 3 3
values : logi [1:4] TRUE FALSE TRUE FALSE
So we can take the lengths and values components separately, and subset the lengths where values == TRUE
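The same idea can be sketched with explicit "X"/"Y" labels, so that a single rle() call gives the runs of both samples. This is only an illustration: the names labels, runs, x_runs and y_runs are made up here, and it assumes no value appears in both X and Y.

```r
# Data from the question
X <- c(14, 5, 2, 8, 9, 10)
Y <- c(7, 3, 4, 13, 11, 12)
Z <- c(2, 3, 4, 7, 8, 9, 10, 11, 12, 13)

# Label each element of Z by the sample it came from
# (assumption: X and Y share no values)
labels <- ifelse(Z %in% X, "X", "Y")

# One run-length encoding covers both samples
runs <- rle(labels)

# Lengths of the X-runs and of the Y-runs
x_runs <- runs$lengths[runs$values == "X"]
y_runs <- runs$lengths[runs$values == "Y"]

# Number of X-runs
length(x_runs)
```

Here x_runs holds the sizes of the X-runs and length(x_runs) is how many there are, which is what the question asks for.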
Data
X <- c(14, 5, 2, 8 , 9, 10)
Y <- c(7, 3, 4, 13, 11, 12)
Z <- c(2, 3, 4, 7, 8, 9, 10, 11, 12, 13)
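Note that the Z above omits 5 and 14, which are in X. If Z is meant to be the full pooled sample (an assumption, the question does not say), a rough sketch is to carry the sample labels through the sort instead of matching values afterwards. The names pooled, src and sorted_labels are hypothetical.

```r
X <- c(14, 5, 2, 8, 9, 10)
Y <- c(7, 3, 4, 13, 11, 12)

# Pool both samples and remember where each value came from
pooled <- c(X, Y)
src <- rep(c("X", "Y"), times = c(length(X), length(Y)))

# Reorder the labels by the sorted order of the pooled values
sorted_labels <- src[order(pooled)]

# Run-length encode the label sequence
runs <- rle(sorted_labels)

x_runs <- runs$lengths[runs$values == "X"]
y_runs <- runs$lengths[runs$values == "Y"]
```

With all twelve values included, the run structure differs from the ten-value Z, so it is worth checking which Z you actually need. This version also does not rely on %in%, so it still labels correctly if the two samples share a value (though the order of tied values then affects the runs).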
Upvotes: 3