aurelius_37809

Reputation: 195

How to loop an Anderson-Darling test over matrix columns and return only the p-values?

I have this matrix called matrix_1:

    c1  c2  c3  c4  c5
R1  27  38  94  40  4
R2  69  16  85  2   15
R3  30  35  64  95  6
R4  20  33  77  98  55
R5  20  44  60  33  89
R6  12  88  87  44  38

I would like to run an Anderson-Darling test (ad.test()) in a loop to compare the distribution of each column with a vector, vector_a. I want the function to return only the p-value from version 1. Here is an example of the output when just one column is compared to vector_a:

T.AD = ( Anderson-Darling  Criterion - mean)/sigma

Null Hypothesis: All samples come from a common population.

             AD  T.AD  asympt. P-value
version 1: 12.9 15.72        2.416e-07
version 2: 12.9 15.76        2.371e-07

I am trying this:

sapply(1:ncol(matrix_1), function(i) ad.test(as.vector(matrix_1[,1:i]), vector_a)$p)

but it overloads the CPU and I never get a result.

Upvotes: 0

Views: 142

Answers (1)

Brian Syzdek

Reputation: 948

It's good form to identify the package you are using. Here, ad.test() comes from kSamples:

library(kSamples)

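The test results are stored in the $ad element of the object that ad.test() returns. Version 1 is the first row and the p-value is the third column, so if you save the result as output, you can extract the value with

output$ad[1, 3]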
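As a minimal sketch of where that value lives, assuming kSamples is installed: the two samples below (`x` and `y`, names chosen here for illustration) are the first two columns of the matrix in the question, and the `ad` element can be inspected before indexing into it.

```r
library(kSamples)

# First two columns of the question's matrix, used only as example samples
x <- c(27, 69, 30, 20, 20, 12)
y <- c(38, 16, 35, 33, 44, 88)

output <- ad.test(x, y)

# With the default asymptotic method this is a 2 x 3 matrix:
# one row per version, with the p-value in the last column
output$ad

# Row 1 = version 1, column 3 = asymptotic p-value
p_version_1 <- output$ad[1, 3]
p_version_1
```

Indexing by position keeps the call short. If you request a different `method`, check the column layout first, since it may not be the same.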

Using a sample vector and setting up your matrix data (note byrow = TRUE, so that the values fill the matrix row by row as in your question):

vector_a <- sample(0:100, 6)  # random sample, so results change from run to run
row_names <- paste0("R", seq(1, 6))
col_names <- paste0("C", seq(1, 5))
matrix_1 <- matrix(
  c(27, 38, 94, 40,  4,
    69, 16, 85,  2, 15,
    30, 35, 64, 95,  6,
    20, 33, 77, 98, 55,
    20, 44, 60, 33, 89,
    12, 88, 87, 44, 38),
  nrow = 6, ncol = 5, byrow = TRUE, dimnames = list(row_names, col_names))

You can use the apply function, passing 2 as the second argument (MARGIN) to iterate over columns:

apply(matrix_1, 2, function(matrix_column) ad.test(as.vector(matrix_column), vector_a)$ad[1,3])

This gives the version 1 p-value for each column (the exact values depend on the random vector_a, so yours will differ):

     C1      C2      C3      C4      C5 
0.12623 0.02507 0.39935 0.81181 0.28477 
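For contrast with the attempt in the question: `matrix_1[, 1:i]` selects columns 1 through `i`, not column `i` alone, so each iteration would test an ever larger pooled sample instead of a single column. The base R sketch below illustrates the difference on a stand-in matrix `m` (arbitrary values, same shape as in the question); it does not call ad.test() at all.

```r
# Stand-in matrix with the same shape as matrix_1 (values are arbitrary)
m <- matrix(1:30, nrow = 6, ncol = 5)

i <- 3
n_pooled <- length(as.vector(m[, 1:i]))  # columns 1 to 3 pooled into one vector
n_single <- length(m[, i])               # column 3 only

n_pooled  # 18
n_single  # 6

# Iterating by index with a single column per step
col_lengths <- sapply(seq_len(ncol(m)), function(i) length(m[, i]))
col_lengths  # 6 6 6 6 6
```

Using `matrix_1[, i]` inside sapply(), or letting apply() hand over one column at a time as above, keeps each test to one column against vector_a.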

EDIT, to address the comment about the one-step function: matrix_column is just the function's argument name. It can be any name you wish. Here is the answer broken into parts:

# Define function
ad_function <- function(matrix_column) {
  # Run ad.test comparing one column of the matrix with vector_a
  # and store the results in ad_test_results
  ad_test_results <- ad.test(as.vector(matrix_column), vector_a)
  # Extract the p-value for version 1
  ad_test_results$ad[1, 3]
}
# Now apply the matrix columns to the function
apply(matrix_1, 2, ad_function)
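One possible variant, sketched here as an assumption rather than as part of the approach above: vapply() declares that each call must return a single number, and the reference vector is passed as an argument instead of being read from the global environment. The helper name `ad_p_value` and the seed are hypothetical choices for illustration, and kSamples is assumed to be installed.

```r
library(kSamples)

set.seed(1)  # arbitrary seed so the random vector_a is repeatable
vector_a <- sample(0:100, 6)
matrix_1 <- matrix(
  c(27, 38, 94, 40,  4,
    69, 16, 85,  2, 15,
    30, 35, 64, 95,  6,
    20, 33, 77, 98, 55,
    20, 44, 60, 33, 89,
    12, 88, 87, 44, 38),
  nrow = 6, ncol = 5, byrow = TRUE,
  dimnames = list(paste0("R", 1:6), paste0("C", 1:5))
)

# Hypothetical helper: version 1 p-value for one column against a reference
ad_p_value <- function(column, reference) {
  ad.test(as.vector(column), reference)$ad[1, 3]
}

# vapply() stops with an error if any call returns something
# other than a single numeric value
p_values <- vapply(
  seq_len(ncol(matrix_1)),
  function(i) ad_p_value(matrix_1[, i], vector_a),
  numeric(1)
)
names(p_values) <- colnames(matrix_1)
p_values
```

The result has the same shape as the apply() output: one named p-value per column.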

Upvotes: 1
