Reputation: 307
I have a dataframe (specifically a correlation matrix). I'd like to replace with NA any values in the matrix that do not contain either an "*" or a "'" (i.e., blank out the cells that are neither statistically significant nor marginally significant).
Data is something like this:
out <- data.frame(V1=c(NA,"-0.28**","-0.18'","-0.11"),
V2=c(NA,NA,"0.01","0.05"),
V3=c(NA,NA,NA,"0.30**"))
rownames(out) <- c("V1","V2","V3","V4")
Returning:
> out
V1 V2 V3
V1 <NA> <NA> <NA>
V2 -0.28** <NA> <NA>
V3 -0.18' 0.01 <NA>
V4 -0.11 0.05 0.30**
What I'd like is the same dataframe with every association that is neither significant nor marginally significant replaced with NA.
Like this:
> out
V1 V2 V3
V1 <NA> <NA> <NA>
V2 -0.28** <NA> <NA>
V3 -0.18' <NA> <NA>
V4 <NA> <NA> 0.30**
Upvotes: 0
Views: 59
Reputation: 81743
out[] <- lapply(out, function(x) "is.na<-"(x, grep("^[^*']+$", x)))
# V1 V2 V3
# V1 <NA> <NA> <NA>
# V2 -0.28** <NA> <NA>
# V3 -0.18' <NA> <NA>
# V4 <NA> <NA> 0.30**
Upvotes: 0
Reputation: 193687
My "SOfun" package has a function called makemeNA that can be used for this.
Usage in this case would be:
makemeNA(out, "^[0-9.-]+$", fixed = FALSE)
# V1 V2 V3
# V1 <NA> NA <NA>
# V2 -0.28** NA <NA>
# V3 -0.18' NA <NA>
# V4 <NA> NA 0.30**
This basically says to replace anything that is just a number (positive or negative) with NA.
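To see which cells that pattern catches without installing anything, here is a rough base R sketch. It only tests the regular expression from the call above against a few values from the question; the vector name vals is made up for illustration, and this is not how makemeNA works internally.

```r
# Sample cell values taken from the question's data
vals <- c("-0.28**", "-0.18'", "-0.11", "0.01", "0.30**")

# TRUE where the whole string consists only of digits, dots, and minus signs
plain <- grepl("^[0-9.-]+$", vals)

# These are the values the pattern would flag for replacement with NA
vals[plain]
```

Anything carrying a "*" or "'" fails the match because those characters are not in the character class, so it is left alone.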
Install the package with:
library(devtools)
install_github("mrdwab/SOfun")
Upvotes: 0
Reputation: 263481
Use the negation of a grepl call. You need sapply because there is no grepl.data.frame method. The pattern is an OR construct with character classes. See ?regex:
> out[ !sapply( out,grepl, patt="[']|[*]") ] <- NA
> out
V1 V2 V3
V1 <NA> <NA> <NA>
V2 -0.28** <NA> <NA>
V3 -0.18' <NA> <NA>
V4 <NA> <NA> 0.30**
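It may help to look at the intermediate logical matrix before negating it. The following is a sketch only, assuming the columns are character vectors (stringsAsFactors = FALSE is spelled out for older R versions); the name keep and the single character class "['*]" are my own choices for illustration and are equivalent to the OR pattern above.

```r
# Rebuild the question's data with character columns
out <- data.frame(V1 = c(NA, "-0.28**", "-0.18'", "-0.11"),
                  V2 = c(NA, NA, "0.01", "0.05"),
                  V3 = c(NA, NA, NA, "0.30**"),
                  stringsAsFactors = FALSE)
rownames(out) <- c("V1", "V2", "V3", "V4")

# One grepl call per column; sapply binds the results into a 4 x 3 logical matrix
keep <- sapply(out, grepl, pattern = "['*]")
keep

# Cells where keep is FALSE (including cells that were already NA) become NA
out[!keep] <- NA
out
```

grepl returns FALSE for NA input, so the cells that were already missing are simply overwritten with NA again.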
Upvotes: 1
Reputation: 99371
You could also do
out[] <- lapply(out, function(x) { is.na(x) <- !grepl("[*']", x); x })
out
# V1 V2 V3
# V1 <NA> <NA> <NA>
# V2 -0.28** <NA> <NA>
# V3 -0.18' <NA> <NA>
# V4 <NA> <NA> 0.30**
Upvotes: 0