Reputation: 177
I have two samples, x and y, and I am trying to compare them with a rank sum test. For the test statistic, I am trying to get the sum of the ranks of the first sample with the following:
x <- c(1,2,3,4)
y <- c(2,4,5,7)
rank(sort(c(x,y)))
[1] 1.0 2.5 2.5 4.0 5.5 5.5 7.0 8.0
However, I don't know how to extract the ranks of the values of the first sample from that. Here's what I've tried:
rank(sort(c(x,y)))[x]
[1] 1.0 2.5 2.5 4.0
but it returns the wrong answer. The correct result should be:
1.0 2.5 4.0 5.5
Upvotes: 1
Views: 112
Reputation: 72593
Perhaps it might be safer to put the samples into a data frame, which you can easily achieve with stack().
(dat <- stack(list(x=x, y=y)))
# values ind
# 1 1 x
# 2 2 x
# 3 3 x
# 4 4 x
# 5 2 y
# 6 4 y
# 7 5 y
# 8 7 y
Then you can subset explicitly by "x" and "y", which is presumably what you had in mind.
with(dat, rank(values)[ind == 'x'])
# [1] 1.0 2.5 4.0 5.5
with(dat, rank(values)[ind == 'y'])
# [1] 2.5 5.5 7.0 8.0
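If the goal is the rank sum itself rather than the individual ranks, the same data frame can be summarised per group. This is only a minimal sketch using base R's tapply(); the name rank_sums is made up for illustration.

```r
x <- c(1, 2, 3, 4)
y <- c(2, 4, 5, 7)
dat <- stack(list(x = x, y = y))

# Rank the pooled values once, then sum the ranks within each sample
rank_sums <- with(dat, tapply(rank(values), ind, sum))
rank_sums
# x = 13, y = 23
```

The two sums add up to 36, which is n(n + 1)/2 for n = 8 pooled observations, so this is a quick sanity check on the ranking.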
Upvotes: 1
Reputation: 886938
We may also use head() with the length of 'x':
head(rank(c(x, y)), length(x))
[1] 1.0 2.5 4.0 5.5
Upvotes: 1
Reputation: 173793
You don't need to sort the concatenated x and y vector. The result of rank(c(x, y)) gives you the ranks of x followed by the ranks of y, so to get the ranks of x you can do:
rank(c(x, y))[seq_along(x)]
#> [1] 1.0 2.5 4.0 5.5
and to get the ranks of y, it's:
rank(c(y, x))[seq_along(y)]
#> [1] 2.5 5.5 7.0 8.0
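To connect this back to the test statistic, here is a rough sketch that sums the ranks of x and compares the result with what wilcox.test() reports. The names sum_rx, W and wt are made up for illustration. Note that wilcox.test() reports W, which is the rank sum of x minus n_x(n_x + 1)/2, not the raw rank sum. Passing exact = FALSE is a choice made here because the data contain ties.

```r
x <- c(1, 2, 3, 4)
y <- c(2, 4, 5, 7)

# Sum of the ranks of x within the pooled sample
sum_rx <- sum(rank(c(x, y))[seq_along(x)])
sum_rx
#> [1] 13

# Convert the rank sum into the W statistic used by wilcox.test()
n_x <- length(x)
W <- sum_rx - n_x * (n_x + 1) / 2
W
#> [1] 3

# Normal approximation, since exact p-values are unavailable with ties
wt <- wilcox.test(x, y, exact = FALSE)
unname(wt$statistic)
#> [1] 3
```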
Upvotes: 4