Reputation: 403
I am new to R and have RNA-seq results for 25 samples. I would like to apply the same steps to all 25 samples to calculate the correlation of my target gene (say, gene ABC) with the other genes.
I know how to do this for a single sample. Here is my code:
df <- read.table("Sample1.txt", header=TRUE, sep="\t")
# gene expression values of interest
gene <- as.numeric(df["ABC",])
# correlate gene with all other genes in the expression set
correlations <- apply(df, 1, function(x) {cor(gene, x)})
But now I have 25 of them. I use lapply to read them all at once.
data <- c("Sample1.txt", "Sample2.txt", ..., "Sample25.txt")
df <- lapply(data, read.table, header=TRUE, sep="\t")
names(df) <- data
However, I am lost on how to connect it with the rest of my code above to calculate gene correlation. I have read some of the related threads but still could not figure it out. Could anyone help me? Thanks!
Upvotes: 4
Views: 4694
Reputation: 12559
You can wrap your steps in a function and apply it to each file:
files <- c("Sample1.txt", "Sample2.txt", ..., "Sample25.txt")
myfunc <- function(file) {
  df <- read.table(file, header=TRUE, sep="\t")
  # gene expression values of interest
  gene <- as.numeric(df["ABC",])
  # correlate gene with all other genes in the expression set
  # (the assigned value is also the function's return value)
  correlations <- apply(df, 1, function(x) cor(gene, x))
}
lapply(files, myfunc)
That version stays closest to your own code. This is how I would write it:
myfunc <- function(file) {
  df <- read.table(file, header=TRUE, sep="\t")
  gene <- as.numeric(df["ABC",])   # gene expression values of interest
  apply(df, 1, FUN=cor, y=gene)    # correlate gene with all others
}
files <- c("Sample1.txt", "Sample2.txt", ..., "Sample25.txt")
lapply(files, myfunc)
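Note that the `...` in the `files` vector is only a placeholder and will not run as typed. A minimal sketch of building the vector programmatically, assuming the files really are named Sample1.txt through Sample25.txt and sit in the working directory (that naming pattern is an assumption based on the question):

```r
# Assumption: files follow the pattern Sample<i>.txt for i = 1..25
files <- paste0("Sample", 1:25, ".txt")

# Alternative (also an assumption about the directory contents):
# pick up whatever matching files exist in the working directory
# files <- list.files(pattern = "^Sample[0-9]+\\.txt$")
```

Keep in mind that `list.files()` sorts names alphabetically, so Sample10.txt would come before Sample2.txt.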
You will probably want to save the results in an object:
L <- lapply(files, myfunc)
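To keep track of which result belongs to which file, the list can be named, and if every file contains the same genes in the same order, the per-sample vectors can be bound into one matrix. Here is a rough sketch with made-up data; the temporary files, the gene names other than ABC, and the helper `write_sample` are hypothetical and exist only to make the example runnable:

```r
# Hypothetical toy data: three small expression tables written to temp files
set.seed(1)
tmpdir <- tempfile("samples")
dir.create(tmpdir)
genes <- c("ABC", "G1", "G2", "G3")

write_sample <- function(i) {
  m <- matrix(rnorm(length(genes) * 6), nrow = length(genes),
              dimnames = list(genes, paste0("rep", 1:6)))
  path <- file.path(tmpdir, paste0("Sample", i, ".txt"))
  # The header has one field fewer than the data rows, so read.table()
  # will use the first column (gene names) as row names
  write.table(m, path, sep = "\t", quote = FALSE)
  path
}
files <- vapply(1:3, write_sample, character(1))

# Same function as in the answer
myfunc <- function(file) {
  df <- read.table(file, header = TRUE, sep = "\t")
  gene <- as.numeric(df["ABC", ])
  apply(df, 1, FUN = cor, y = gene)
}

L <- lapply(files, myfunc)
names(L) <- basename(files)      # label each result with its file name
cor_mat <- do.call(cbind, L)     # genes in rows, samples in columns
```

With the results in this shape, `cor_mat["G1", ]` would give the correlation of G1 with ABC across all samples.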
The function can be simplified even further, because cor() accepts matrix arguments:
myfunc <- function(file) {
  df <- read.table(file, header=TRUE, sep="\t")
  cor(t(df), y=as.numeric(df["ABC",]))   # correlate gene with all others
}
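One difference worth knowing: the matrix form returns a one-column matrix rather than a named vector. The following sketch, using a made-up in-memory data frame (the gene and sample names are hypothetical), compares the two approaches:

```r
# Hypothetical toy expression table: 5 genes (rows) x 8 samples (columns)
set.seed(42)
df <- as.data.frame(matrix(rnorm(5 * 8), nrow = 5,
        dimnames = list(c("ABC", paste0("G", 1:4)), paste0("s", 1:8))))

gene <- as.numeric(df["ABC", ])

by_row    <- apply(df, 1, FUN = cor, y = gene)  # named numeric vector
by_matrix <- cor(t(df), y = gene)               # matrix: one row per gene, one column

# Extract the single column to get a named vector for comparison
same <- isTRUE(all.equal(by_row, by_matrix[, 1]))
```

If a plain vector is preferred, `by_matrix[, 1]` or `drop(by_matrix)` converts the result.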
Upvotes: 3