Reputation: 13
I would like to use a one-sample Wilcoxon signed-rank test in R to test whether each column of a data frame is significantly greater than 0. I can go through each column individually, but I would ideally like to use lapply to cycle through each column and record the p-values in a separate data frame. Each row of the data frame lists monthly values for a given year:
df = data.frame("year"=c(1:20), "jan"=runif(20), "feb"=runif(20))
... with 13 total columns for year and each month.
The code I am using now compares each column to zero, but I would like to incorporate the lapply function to streamline things a bit:
wilcox.test(df[,1], mu=0, alternative="greater")
I have tried:
res = lapply(df, function(x){
wilcox.test(df[,x[1]], mu=0, alternative="greater")
})
But I am getting an error that my input to the wilcox.test function is not numeric, which makes me think it is not reading in individual columns. I have tried using some suggestions in this post but am having trouble modifying the code to work for a one-sample test. I am new to lapply and writing functions, so any help is greatly appreciated!
Upvotes: 1
Views: 975
Reputation: 24770
You can directly apply over columns in a data.frame with lapply. Make sure that you only pass columns that contain numeric values by subsetting to only those columns.
lapply(df[,2:13],function(x){wilcox.test(x, mu=0, alternative="greater")})
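Your version doesn't work because inside the function x is already the column's values, so df[,x[1]] tries to subset df by the first value of that column (e.g. df[,df[1,2]] for jan) instead of by a column position such as df[,2]. For the month columns that value is a fraction between 0 and 1, which selects no column, so wilcox.test receives something that is not a numeric vector.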
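A minimal sketch of what lapply actually hands to the function, assuming the two-month df from the question. The seed and the names first_vals and res are only for illustration and are not part of the original code:

```r
set.seed(1)  # assumption: seed added only so the example is reproducible
df <- data.frame(year = 1:20, jan = runif(20), feb = runif(20))

# lapply/sapply iterate over the columns, so x is the vector of values
first_vals <- sapply(df, function(x) x[1])
first_vals  # first value of each column, not a column index

# Because x is already the data, pass it to wilcox.test() directly
res <- lapply(df[, c("jan", "feb")], function(x) {
  wilcox.test(x, mu = 0, alternative = "greater")
})
res$jan$p.value
```

Each element of res is the full test object, so the statistic and the p-value are both available per column.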
To streamline things even further, you can use sapply and $p.value to access just the p-value results.
sapply(df[,2:13],function(x){wilcox.test(x, mu=0, alternative="greater")$p.value})
#          jan          feb          mar          apr          may          jun
# 9.536743e-07 9.536743e-07 9.536743e-07 9.536743e-07 9.536743e-07 9.536743e-07
#          jul          aug          sep          oct          nov          dec
# 9.536743e-07 9.536743e-07 9.536743e-07 9.536743e-07 9.536743e-07 9.536743e-07
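If the goal is to keep the p-values in a separate data frame, as the question asks, one possible sketch is below. It assumes the same simulated data as in the Data section; month_cols, p_vals and p_df are hypothetical names, and vapply is swapped in for sapply only so that the result is guaranteed to be one number per column:

```r
set.seed(42)  # assumption: seed added only so the example is reproducible
df <- data.frame(year = 1:20, lapply(rep(20, 12), runif))
names(df)[2:13] <- tolower(month.abb)

# Select the month columns by name rather than by position
month_cols <- setdiff(names(df), "year")

# One p-value per column; numeric(1) makes vapply check the return type
p_vals <- vapply(df[month_cols], function(x) {
  wilcox.test(x, mu = 0, alternative = "greater")$p.value
}, numeric(1))

# Collect the results in a separate data frame, one row per month
p_df <- data.frame(month = names(p_vals), p_value = unname(p_vals))
p_df
```

Selecting columns by name means the code keeps working if the column order changes or more non-month columns are added later.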
Data
df <- data.frame(year = 1:20, lapply(rep(20,12),runif))
names(df)[2:13] <- tolower(month.abb)
Upvotes: 1