Reputation: 1131
I have two matrices, A and B, in R with an equal number of rows but a different number of columns.
I want to run a Kolmogorov-Smirnov test row by row across the two matrices. That is, the first test would be ks.test(as.vector(A[1,]), as.vector(B[1,]))
, the second would be ks.test(as.vector(A[2,]), as.vector(B[2,]))
and so on. Ideally, I would store the result of each test in a vector or data frame.
I figured mapply would be appropriate, but it keeps giving me way more results than expected. I think it is performing the tests element by element instead of row by row. This is my code chunk:
mapply(ks.test, x=A, y=B)
Even testing just the first row does not work as expected when I simply run:
mapply(ks.test, x=as.vector(A[1,]), y=as.vector(B[1,]))
How can I get the desired output of N p-values, where N is the number of rows of my original matrices?
This is what the first row of each of my matrices looks like:
> A[1,]
[1] 0 0 0 0 0 0 0 0 0
> B[1,]
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Upvotes: 0
Views: 763
Reputation: 887501
We can use Map/mapply. When using Map/mapply, keep in mind that it applies the function to corresponding elements of its inputs. If the inputs are vectors, the corresponding elements are the individual elements of each vector. A matrix is just a vector with dimensions, so the function is applied to each individual element. Therefore, we can split each matrix by row and then apply the function to the corresponding list elements:
unlist(mapply(ks.test, split(A, row(A)), split(B, row(B)))[2,], use.names = FALSE)
#[1] 0.3571429 0.8730159 0.8730159 0.3571429 0.8730159
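To make the element-versus-row point concrete, here is a minimal sketch. The matrix `m` is a made-up example for illustration only, not the asker's data:

```r
# Hypothetical 2 x 3 matrix, used only for illustration
m <- matrix(1:6, nrow = 2)

# mapply treats the matrix as a plain vector, so it visits all 6 elements
n_calls <- length(mapply(function(x) x, m))
n_calls
# [1] 6

# split() with row() groups the values by row index: one vector per row
rows <- split(m, row(m))
length(rows)
# [1] 2
rows[["1"]]
# [1] 1 3 5
```

Passing two such lists to mapply gives one call per row, which is the behavior the question asks for.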
Or using a for loop:
r1 <- numeric(nrow(A))
for(i in seq_len(nrow(A))){
r1[i] <- ks.test(A[i,], B[i,])$p.value
}
r1
#[1] 0.3571429 0.8730159 0.8730159 0.3571429 0.8730159
Data used in the examples above:
set.seed(24)
A <- matrix(rnorm(25), 5, 5)
set.seed(42)
B <- matrix(rnorm(25), 5, 5)
Upvotes: 2
Reputation: 37661
You can get what you want with sapply applied to the row indices:
sapply(1:nrow(A), function(i) ks.test(as.vector(A[i,]), as.vector(B[i,])))
Actually, it looks like the only interesting part is the p-values, so this could be simplified to:
sapply(1:nrow(A), function(i) ks.test(as.vector(A[i,]), as.vector(B[i,]))$p.value)
[1] 0.01587302 0.01587302 0.01587302 0.01587302
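Since the question also asks about storing the results, here is a sketch of the same idea using vapply and a data frame. The data is simulated (the 9 and 36 column counts only mirror the shapes shown in the question), and the names `p_values` and `results` are made up for illustration:

```r
# Made-up data: same number of rows, different numbers of columns
set.seed(1)
A <- matrix(rnorm(4 * 9), nrow = 4)
B <- matrix(rnorm(4 * 36), nrow = 4)

# One KS test per row; vapply enforces one numeric value per row
p_values <- vapply(
  seq_len(nrow(A)),
  function(i) ks.test(A[i, ], B[i, ])$p.value,
  numeric(1)
)

# Collect the p-values alongside their row index
results <- data.frame(row = seq_len(nrow(A)), p.value = p_values)
results
```

If B is actually a data frame rather than a matrix (the V1...V36 names in the question may hint at that, but this is only a guess), converting it with as.matrix(B) first should make B[i, ] a plain numeric vector.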
Upvotes: 2