S Das

Count of the runs from x to y

I am trying to count the runs in the combined, sorted sample of two random variables x and y.

x <- rnorm(30, mean = 4, sd = 1)
y <- rnorm(20, mean = 2.5, sd = 1)
nx <- length(x)
ny <- length(y)
data <- c(x, y)
names(data) <- c(rep("x", nx), rep("y", ny))
data <- sort(data)
rank <- rank(data)
rbind(data, rank)

             y        y        y        y        x        y        y        x        x         y         y        x         y
data 0.6814071 1.014124 1.049729 1.050243 1.164338 1.813754 1.955806 1.973856 2.013982  2.065336  2.402596  2.40338  2.445579
rank 1.0000000 2.000000 3.000000 4.000000 5.000000 6.000000 7.000000 8.000000 9.000000 10.000000 11.000000 12.00000 13.000000
             x         y         x         y         x         y        x         y         x         x         x         y
data  2.495905  2.533128  2.605192  2.675883  2.705004  2.740396  2.84131  2.841654  2.886925  3.024089  3.115692  3.246089
rank 14.000000 15.000000 16.000000 17.000000 18.000000 19.000000 20.00000 21.000000 22.000000 23.000000 24.000000 25.000000
             x         x         x         y         x         x         y         y         x       y         x         x
data  3.389303  3.398962  3.606407  3.657708  3.716344  3.763198  3.895701  3.944308  3.955861  3.9881  4.022458  4.075013
rank 26.000000 27.000000 28.000000 29.000000 30.000000 31.000000 32.000000 33.000000 34.000000 35.0000 36.000000 37.000000
             x        y         x         x        x         y         x         x         x         x         x         x
data  4.151537  4.21085  4.245625  4.355177  4.35652  4.409624  4.522272  4.541122  4.616041  4.619815  4.696114  4.988771
rank 38.000000 39.00000 40.000000 41.000000 42.00000 43.000000 44.000000 45.000000 46.000000 47.000000 48.000000 49.000000
             x
data  5.591174
rank 50.000000

names(data)

 [1] "y" "y" "y" "y" "x" "y" "y" "x" "x" "y" "y" "x" "y" "x" "y" "x" "y" "x" "y" "x" "y" "x" "x" "x" "y" "x" "x" "x" "y" "x" "x"
[32] "y" "y" "x" "y" "x" "x" "x" "y" "x" "x" "x" "y" "x" "x" "x" "x" "x" "x" "x"

In the final output [names(data)], each change from "y" to "x" (or from "x" to "y") should increase the run count by 1: the first change gives 1, the next gives 2, and so on. A step from "x" to "x" or from "y" to "y" adds 0. I would like to get the total runs count over all 50 values. I have been trying to use the "rle" function, but I cannot get the output I want.

Thanks in advance.

Answers (1)

IRTFM

So it would appear that the answer is:

length( rle(names(data))$values )

> rle(names(data))
Run Length Encoding
  lengths: int [1:16] 6 1 7 1 2 2 1 2 1 1 ...
  values : chr [1:16] "y" "x" "y" "x" "y" "x" "y" "x" "y" "x" "y" ...
> nx+ny
[1] 50

With my random draw the answer was 16, but with yours it was apparently larger. You should use set.seed(.) to construct reproducible examples.
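
As a check against the labels printed in the question, here is a minimal sketch that types in the names(data) output by hand (so the vector `labels` is an assumption based on that printout, not regenerated data). It reports both the number of runs, which is what rle() gives, and the number of label changes, which is one less, in case the "y" to "x" switches themselves are what should be counted. The names `labels`, `n_runs` and `n_switches` are only illustrative.

```r
# Labels copied by hand from the names(data) printout in the question
# (positions 1-31 and 32-50), one character per observation.
labels <- strsplit(paste0("yyyyxyyxxyyxyxyxyxyxyxxxyxxxyxx",
                          "yyxyxxxyxxxyxxxxxxx"), "")[[1]]

runs <- rle(labels)

# Number of runs: maximal blocks of identical consecutive labels.
n_runs <- length(runs$lengths)

# Number of switches: positions where a label differs from the previous one.
n_switches <- sum(labels[-1] != labels[-length(labels)])

n_runs      # 28
n_switches  # 27
```

If the switch count is the quantity wanted, length(rle(names(data))$values) - 1 gives the same number without the explicit comparison.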
