oooo

Reputation: 177

How to get the ranks of the first sample from ranked data

I have two samples, x and y, and I am trying to compare them with a rank-sum test. For the test statistic, I am trying to get the sum of the ranks of the first sample with the following.

x <- c(1,2,3,4)
y <- c(2,4,5,7)

rank(sort(c(x,y)))
[1]  1.0 2.5 2.5 4.0 5.5 5.5 7.0 8.0

However, I don't know how to extract the ranks of the values of the first sample from that. Here's what I've tried:

rank(sort(c(x,y)))[x]
[1] 1.0 2.5 2.5 4.0

but it returns the wrong answer. The correct result should be

 1.0 2.5 4.0 5.5

Upvotes: 1

Views: 112

Answers (3)

jay.sf

Reputation: 72593

Perhaps it might be safer to put the samples into a data frame, which you can easily achieve with stack().

(dat <- stack(list(x=x, y=y)))
#   values ind
# 1      1   x
# 2      2   x
# 3      3   x
# 4      4   x
# 5      2   y
# 6      4   y
# 7      5   y
# 8      7   y

Then you can subset explicitly by "x" and "y", which is what you had in mind.

with(dat, rank(values)[ind == 'x'])
# [1] 1.0 2.5 4.0 5.5

with(dat, rank(values)[ind == 'y'])
# [1] 2.5 5.5 7.0 8.0
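
Since the question is ultimately after the rank sum, the same data frame can give both sums in one step. This is only a sketch: rank_sums is a name made up for illustration, and x and y are the vectors from the question.

```r
x <- c(1, 2, 3, 4)
y <- c(2, 4, 5, 7)
dat <- stack(list(x = x, y = y))

# Rank the pooled values once, then sum the ranks within each sample
rank_sums <- with(dat, tapply(rank(values), ind, sum))
rank_sums
# x = 13, y = 23
```

The two sums add up to 36, which is 8 * 9 / 2, the total of the ranks 1 to 8, so this makes a quick sanity check.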

Upvotes: 1

akrun

Reputation: 886938

We may also use head() with the length of 'x' to keep only the first length(x) ranks

head(rank(c(x, y)), length(x))
[1] 1.0 2.5 4.0 5.5
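
By the same logic, tail() should give the ranks of 'y', since they come last in the pooled vector. A small sketch, with r, ranks_x and ranks_y as illustrative names:

```r
x <- c(1, 2, 3, 4)
y <- c(2, 4, 5, 7)

# Rank the pooled sample once; x occupies the first positions, y the last
r <- rank(c(x, y))
ranks_x <- head(r, length(x))  # 1.0 2.5 4.0 5.5
ranks_y <- tail(r, length(y))  # 2.5 5.5 7.0 8.0
```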

Upvotes: 1

Allan Cameron

Reputation: 173793

You don't need to sort the concatenated x and y vector. The result of rank(c(x, y)) gives you the ranks of x then the ranks of y, so to get the ranks of x you can do:

rank(c(x, y))[seq_along(x)]
#> [1] 1.0 2.5 4.0 5.5

and to get the ranks of y, it's:

rank(c(y, x))[seq_along(y)]
#> [1] 2.5 5.5 7.0 8.0
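
If the goal is the test statistic itself, you can sum those ranks and, as a check, compare against wilcox.test(). As far as I know, the W it reports for two samples is the rank sum of x minus n_x * (n_x + 1) / 2, so the two values should agree after that shift. The names rank_sum_x, n_x, w_manual and w_test below are just for illustration.

```r
x <- c(1, 2, 3, 4)
y <- c(2, 4, 5, 7)

# Sum of the ranks of x in the pooled sample
rank_sum_x <- sum(rank(c(x, y))[seq_along(x)])  # 13

# Shift by the smallest possible rank sum for a sample of size n_x
n_x <- length(x)
w_manual <- rank_sum_x - n_x * (n_x + 1) / 2    # 3

# exact = FALSE asks for the normal approximation (the data contain ties);
# only the statistic is used here, not the p-value
w_test <- unname(wilcox.test(x, y, exact = FALSE)$statistic)

all.equal(w_manual, w_test)
```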

Upvotes: 4
