guy

Reputation: 1131

R apply KS Test to 2 Matrices Row by Row

I have two matrices A & B in R with equal number of rows, but different number of columns.

I want to run a Kolmogorov-Smirnov test row by row across the two matrices. That is, the first test would be ks.test(as.vector(A[1,]), as.vector(B[1,])), the second would be ks.test(as.vector(A[2,]), as.vector(B[2,])), and so on, ideally storing the result of each test in a vector or data frame.

I figured mapply would be appropriate but it keeps giving me way more results than expected. I think it is performing the tests element by element instead of row by row. This is my code chunk: mapply(ks.test, x=A, y=B)

Testing just the first row does not work as expected either when I run: mapply(ks.test, x=as.vector(A[1,]), y=as.vector(B[1,]))

How can I get the desired output of N p-values, where N is the number of rows of my original matrices?

This is what the first row of each of my matrices looks like:

> A[1,]

[1] 0 0 0 0 0 0 0 0 0

> B[1,]

 V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 

Upvotes: 0

Views: 763

Answers (2)

akrun

Reputation: 887501

We can use Map/mapply. Keep in mind that Map/mapply applies the function to corresponding elements of its inputs. If the inputs are vectors, the corresponding elements are the individual values. A matrix is just a vector with a dim attribute, so the function is likewise applied to each individual value rather than to each row. Therefore, we can split each matrix by row and then apply the function to the corresponding list elements:

unlist(mapply(ks.test, split(A, row(A)), split(B, row(B)))["p.value",], use.names = FALSE)
#[1] 0.3571429 0.8730159 0.8730159 0.3571429 0.8730159

Or using a for loop

r1 <- numeric(nrow(A))
for(i in seq_len(nrow(A))){
    r1[i] <- ks.test(A[i,], B[i,])$p.value
}
r1
#[1] 0.3571429 0.8730159 0.8730159 0.3571429 0.8730159

data

set.seed(24)
A <- matrix(rnorm(25), 5, 5)
set.seed(42)
B <- matrix(rnorm(25), 5, 5)
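
As a rough check of the point about element-wise iteration, here is a minimal sketch on hypothetical toy matrices (3 rows, with 5 and 8 columns; these are not the asker's data). It only illustrates that split(A, row(A)) yields one vector per row and that the mapply result then agrees with a plain row-index loop. The names rows_A, rows_B, p_vals and p_loop are made up for this example.

```r
# Hypothetical toy data: equal number of rows, different number of columns
set.seed(1)
A <- matrix(rnorm(15), nrow = 3, ncol = 5)
B <- matrix(rnorm(24), nrow = 3, ncol = 8)

# A matrix is a vector with a dim attribute, so mapply(ks.test, A, B)
# would iterate over single values, not rows
length(A)        # 15 individual values

# Splitting by row index gives a list with one numeric vector per row
rows_A <- split(A, row(A))
rows_B <- split(B, row(B))
length(rows_A)   # 3 list elements, one per row

# Row-wise p-values via mapply on the two lists
p_vals <- mapply(function(x, y) ks.test(x, y)$p.value, rows_A, rows_B)

# Same computation with an explicit loop over row indices
p_loop <- vapply(seq_len(nrow(A)),
                 function(i) ks.test(A[i, ], B[i, ])$p.value,
                 numeric(1))

# The two approaches should agree
stopifnot(isTRUE(all.equal(unname(p_vals), p_loop)))
```

Extracting $p.value inside the anonymous function avoids having to index into the matrix of test results afterwards.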

Upvotes: 2

G5W

Reputation: 37661

You can get what you want by using sapply on the row indices:

sapply(seq_len(nrow(A)), function(i) ks.test(as.vector(A[i,]), as.vector(B[i,])))

Actually, it looks like the only interesting part is the p-values, so this can be simplified to

sapply(seq_len(nrow(A)), function(i) ks.test(as.vector(A[i,]), as.vector(B[i,]))$p.value)
[1] 0.01587302 0.01587302 0.01587302 0.01587302
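
Since the question also mentions storing the results in a data frame, one possible sketch (on hypothetical random matrices, not the asker's data) collects the statistic and p-value of each row-wise test. The names res and kt and the column names are assumptions chosen for illustration.

```r
# Hypothetical toy data: 3 rows, 5 and 8 columns
set.seed(1)
A <- matrix(rnorm(15), nrow = 3)
B <- matrix(rnorm(24), nrow = 3)

# One small data frame per row, then bind them together
res <- do.call(rbind, lapply(seq_len(nrow(A)), function(i) {
  kt <- ks.test(A[i, ], B[i, ])
  data.frame(row = i,
             statistic = unname(kt$statistic),  # drop the "D" name
             p.value = kt$p.value)
}))
res
```

With data containing many ties, such as the all-zero rows shown in the question, ks.test will typically warn that exact p-values cannot be computed, so the p-values there should be read with care.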

Upvotes: 2
