user86533

Reputation: 333

R ExpressionSet filter NA values

I want to create the following ExpressionSet in R:

dataDirectory <- system.file("extdata", package = "Biobase")
exprsFile <- "path to expression data.txt"
exprs <- as.matrix(read.table(exprsFile, header = TRUE, sep = "\t", row.names = 1, as.is = TRUE))

pDataFile <- "path to phenotype data.txt"
pData <- read.table(pDataFile, row.names=1, header=TRUE, sep="\t")
phenoData <- new("AnnotatedDataFrame",data=pData)

Now I delete the columns of exprs in which more than 80% of the values are NA:

exprs <- exprs[, colMeans(is.na(exprs)) <= 0.8]

Before I can run the following code and build the ExpressionSet, I have to delete all the rows in phenoData (= samples) that correspond to the columns deleted from exprs above. How can I achieve that?

exampleSet <- ExpressionSet(assayData=exprs, phenoData=phenoData)
exampleSet

Upvotes: 0

Views: 711

Answers (1)

Martin Morgan

Reputation: 46866

Build the ExpressionSet (without filtering)

exampleSet <- ExpressionSet(assayData=exprs, phenoData=phenoData)

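and then subset it, using the exprs() function to work with the underlying matrix of expression values. Subsetting an ExpressionSet by column also drops the matching rows of phenoData, so samples and expression columns stay in sync. Note that colSums(is.na(...)) returns NA counts rather than fractions, so use colMeans() for an 80% threshold:

exampleSet[, colMeans(is.na(exprs(exampleSet))) <= 0.8]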
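If you would rather filter before building the ExpressionSet, the same idea can be sketched in base R. This is only an illustration on made-up toy data: the names exprs_toy, pdata_toy, na_frac and keep are hypothetical, and it assumes the row names of the phenotype table match the column names of the expression matrix. It uses colMeans() because colSums(is.na(x)) gives the number of NAs per column, not the fraction.

```r
# Toy data (hypothetical): 5 features x 3 samples; sample "s2" is entirely NA.
exprs_toy <- matrix(
  c(1, 2, 3, 4, 5,
    NA, NA, NA, NA, NA,
    6, NA, 8, 9, 10),
  nrow = 5,
  dimnames = list(paste0("g", 1:5), c("s1", "s2", "s3"))
)
# Phenotype table with one row per sample, keyed by sample name.
pdata_toy <- data.frame(group = c("a", "b", "a"),
                        row.names = c("s1", "s2", "s3"))

# Fraction of NA values per sample (column).
na_frac <- colMeans(is.na(exprs_toy))

# Keep samples with at most 80% NA values.
keep <- na_frac <= 0.8

# Apply the same selection to both objects so they stay aligned.
exprs_kept <- exprs_toy[, keep, drop = FALSE]
pdata_kept <- pdata_toy[colnames(exprs_kept), , drop = FALSE]

# Sanity check: samples must match between the two objects.
stopifnot(identical(rownames(pdata_kept), colnames(exprs_kept)))
```

The filtered matrix and phenotype table could then be passed to new("AnnotatedDataFrame", ...) and ExpressionSet() as in the question. Indexing the phenotype table by colnames(exprs_kept) rather than by the logical vector keeps the result correct even if the two objects list the samples in a different order.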

Ask questions about Bioconductor packages on the Bioconductor support site.

Upvotes: 1
