Reputation: 195
I have this matrix called matrix_1:
   c1 c2 c3 c4 c5
R1 27 38 94 40  4
R2 69 16 85  2 15
R3 30 35 64 95  6
R4 20 33 77 98 55
R5 20 44 60 33 89
R6 12 88 87 44 38
I would like to run an Anderson-Darling test (ad.test()) in a loop to compare the distribution of each column with a vector, vector_a. I want the function to return only the p-value from version 1. Here is an example output using just one column compared to vector_a:
T.AD = ( Anderson-Darling Criterion - mean)/sigma
Null Hypothesis: All samples come from a common population.
AD T.AD asympt. P-value
version 1: 12.9 15.72 2.416e-07
version 2: 12.9 15.76 2.371e-07
I am trying this:
sapply(1:ncol(matrix_1), function(i) ad.test(as.vector(matrix_1[,1:i]), vector_a)$p)
but it overloads the CPU and I never get a result.
Reputation: 948
It's good form to identify the package you are using:
library(kSamples)
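The test results are in the $ad element of the object that ad.test() returns. Version 1 is the first row and the p-value is the third column, so, with that object stored in output, you can capture the p-value with
output$ad[1,3]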
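Using a sample vector, and setting up your matrix data:
vector_a <- sample(0:100, 6)  # random example vector; call set.seed() first for reproducible results
row_names <- paste0("R", 1:6)  # avoid masking base::rownames()
col_names <- paste0("C", 1:5)  # avoid masking base::colnames()
matrix_1 <- matrix(
  c(27, 38, 94, 40, 4,
    69, 16, 85, 2, 15,
    30, 35, 64, 95, 6,
    20, 33, 77, 98, 55,
    20, 44, 60, 33, 89,
    12, 88, 87, 44, 38),
  nrow = 6, ncol = 5, byrow = TRUE,  # the values above are listed row by row
  dimnames = list(row_names, col_names))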
You can use the apply function, specifying 2 as the second argument (MARGIN) to iterate over columns:
apply(matrix_1, 2, function(matrix_column) ad.test(as.vector(matrix_column), vector_a)$ad[1,3])
This gives the version 1 p-value for each column (your values will differ, because vector_a is sampled at random):
     C1      C2      C3      C4      C5
0.12623 0.02507 0.39935 0.81181 0.28477
EDIT to address the comment about the one-step function:
matrix_column is the function's argument name; it can be any name you wish. Here is the answer broken into parts:
# Define function
ad_function <- function(matrix_column) {
  # Run ad.test comparing matrix_column (one column of the matrix) with vector_a
  # and assign the results to ad_test_results
  ad_test_results <- ad.test(as.vector(matrix_column), vector_a)
  # This gets the p-value for version 1
  ad_test_results$ad[1,3]
}
# Now apply the function to the matrix columns
apply(matrix_1, 2, ad_function)
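A side note on the attempt in the question: matrix_1[,1:i] selects columns 1 through i rather than column i alone, so each iteration passes a growing, pooled block of columns to the test instead of a single column. Here is a minimal base-R sketch of the difference (it needs no kSamples; the names i, pooled and single are just illustrative):

```r
# The matrix from the question, filled row by row
matrix_1 <- matrix(
  c(27, 38, 94, 40,  4,
    69, 16, 85,  2, 15,
    30, 35, 64, 95,  6,
    20, 33, 77, 98, 55,
    20, 44, 60, 33, 89,
    12, 88, 87, 44, 38),
  nrow = 6, ncol = 5, byrow = TRUE,
  dimnames = list(paste0("R", 1:6), paste0("C", 1:5)))

i <- 3

# 1:i selects columns 1 to 3, so as.vector() flattens 3 * 6 = 18 values
pooled <- as.vector(matrix_1[, 1:i])
length(pooled)

# A single index selects only column 3, i.e. 6 values
single <- matrix_1[, i]
length(single)
```

If you prefer to keep the index-based loop, the same idea as the apply() call should carry over by indexing one column at a time and extracting the p-value from $ad, something like sapply(seq_len(ncol(matrix_1)), function(i) ad.test(matrix_1[, i], vector_a)$ad[1,3]).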