Reputation: 537
For the data frame below, I want to perform Kolmogorov-Smirnov tests for multiple columns. Column ID is the record ID; A-D are factors with 2 levels each ('Other', shown as O, and A, B, C, D respectively). My test variable is in column E.
Now I would like to perform 4 KS tests, one for each of the columns A-D, comparing the values of E between the two levels of that column.
In reality, I have 80 columns, so I'm looking for a way to perform these 80 tests 'simultaneously'.
  ID A B C D  E
1  1 O B C O  1
2  2 O O O O  3
3  3 O O O D  2
4  4 A O C D  7
5  5 A B O O 12
6  6 O O O O  4
7  7 O B O O  8
Upvotes: 1
Views: 1538
Reputation: 1400
I hope this solves your problem:
dat <- read.table("path/data.txt", header = TRUE)  # your data imported into my session; adjust the path
cols <- c("A", "B", "C", "D")  # the columns with categories; the others are left out
E <- dat$E                     # keep the test variable E
results <- lapply(cols, function(i) {  # split E by the two levels of each column
  x <- factor(dat[, i])
  a <- E[x == levels(x)[1]]
  b <- E[x == levels(x)[2]]
  ks.test(a, b)
})
names(results) <- cols  # a list with one test result per column
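With 80 columns, typing out the column names becomes tedious. The following is a minimal, self-contained sketch of the same idea. It assumes that every column other than ID and E is a two-level factor. The names `tests` and `summary_tab` are only illustrative, and the sample data is typed in from the question.

```r
# Sample data copied from the question ("O" stands for 'Other').
dat <- data.frame(
  ID = 1:7,
  A = c("O", "O", "O", "A", "A", "O", "O"),
  B = c("B", "O", "O", "O", "B", "O", "B"),
  C = c("C", "O", "O", "C", "O", "O", "O"),
  D = c("O", "O", "D", "D", "O", "O", "O"),
  E = c(1, 3, 2, 7, 12, 4, 8)
)

# Assumption: every column other than ID and E is a two-level factor.
cols <- setdiff(names(dat), c("ID", "E"))

# Run one two-sample KS test per factor column.
tests <- lapply(cols, function(col) {
  groups <- split(dat$E, dat[[col]])  # E values for each level of the column
  stopifnot(length(groups) == 2)      # fail early if a column is not two-level
  ks.test(groups[[1]], groups[[2]])
})
names(tests) <- cols

# Collect the test statistic and p-value of each test into one table.
summary_tab <- data.frame(
  column    = cols,
  statistic = sapply(tests, function(res) unname(res$statistic)),
  p_value   = sapply(tests, function(res) res$p.value),
  row.names = NULL
)
print(summary_tab)
```

Individual results stay available by name, for example `tests$A`. With real data, `ks.test` may warn about ties if E contains repeated values.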
Upvotes: 3