Reputation: 451
Say I have a data frame whose row names are the concatenated names of a numerator and a denominator. The two columns of this data frame are the numerator and the denominator.
up<-c("up1","up2","up3","up4")
down<-c("down1","down2","down3")
singleRatio<-as.data.frame(expand.grid(up,down))
rownames(singleRatio)<-paste(singleRatio$Var1,singleRatio$Var2,sep="_")
colnames(singleRatio)<-c("numerator","denominator")
Each numerator and denominator has corresponding entries in a dataframe with the num/denom as rows and samples as columns.
sample1<-c(1,2,3,4,5,1,2)
sample2<-c(5,4,5,7,2,2,3)
sample3<-c(2,3,6,5,3,2,3)
sample4<-c(5,5,5,8,1,2,3)
data<-data.frame(sample1,sample2,sample3,sample4)
rownames(data)<-c(up,down)
I want to build a data frame of test results: calculate every ratio and compare it to a threshold (1 if it's over the threshold, 0 if it's under). This creates the empty ratios.df:
ratios.df<-data.frame(matrix(nrow = length(rownames(singleRatio)),ncol = length(colnames(data)) ))
rownames(ratios.df)<-rownames(singleRatio)
colnames(ratios.df)<-colnames(data)
ratios.df
I've got a function called getRatio to find all the ratios for each sample:
getRatio<-function(sampleData){
  sampleRatios<-rep(0,each=length(rownames(singleRatio)))
  names(sampleRatios)<-rownames(singleRatio)
  for( ratio in rownames(singleRatio)){
    sampleRatios[ratio]<-sampleData[singleRatio[ratio,1]]/(sampleData[singleRatio[ratio,1]] + sampleData[singleRatio[ratio,2]])
  }
  return(sampleRatios)
}
And this is my attempt to bring everything together.
thresholds<-c(0.1,0.5,0.1,0.5,0.1,0.5,0.1,0.5,0.1,0.5,0.1,0.5)
for (sampleName in colnames(data)){
  dataline<-data[,sampleName]
  names(dataline)<-rownames(data)
  sampleRatios<-getRatio(dataline)
  ratios.df[,sampleName]<-sampleRatios
  #ratios.df[,sampleName]<-ifelse(sampleRatios > thresholds,1,0)
}
The problem is that when I look at the resulting ratios, nothing matches. ratios.df ends up being:
> ratios.df
            sample1   sample2   sample3   sample4
up1_down1 0.5000000 0.5000000 0.5000000 0.5000000
up2_down1 0.6666667 0.4444444 0.6000000 0.5000000
up3_down1 0.7500000 0.5000000 0.7500000 0.5000000
up4_down1 0.8000000 0.5833333 0.7142857 0.6153846
up1_down2 0.3333333 0.5555556 0.4000000 0.5000000
up2_down2 0.5000000 0.5000000 0.5000000 0.5000000
up3_down2 0.6000000 0.5555556 0.6666667 0.5000000
up4_down2 0.6666667 0.6363636 0.6250000 0.6153846
up1_down3 0.2500000 0.5000000 0.2500000 0.5000000
up2_down3 0.4000000 0.4444444 0.3333333 0.5000000
up3_down3 0.5000000 0.5000000 0.5000000 0.5000000
up4_down3 0.5714286 0.5833333 0.4545455 0.6153846
And the original data is
> data
      sample1 sample2 sample3 sample4
up1         1       5       2       5
up2         2       4       3       5
up3         3       5       6       5
up4         4       7       5       8
down1       5       2       3       1
down2       1       2       2       2
down3       2       3       3       3
So the ratio for up1_down1 in sample1 should be 1/(1+5)=0.17, not 0.50. Long story short, I have no idea why, or even where, things are getting swapped around. Can anyone see what I'm doing wrong?
Upvotes: 0
Views: 37
Reputation: 57210
The problem is that the singleRatio data.frame contains two columns of factors rather than characters, so in a selection like sampleData[singleRatio[ratio,1]] the factor is coerced to its integer code instead of its string representation, and the wrong value is selected.
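To make the coercion visible, here is a minimal sketch that reuses the names from the question; sampleData stands in for the sample1 column, and col_classes, denom, by_factor and by_name are illustrative names, not part of the original code.

```r
up   <- c("up1", "up2", "up3", "up4")
down <- c("down1", "down2", "down3")

# expand.grid() turns character vectors into factors unless told otherwise.
grid <- expand.grid(up, down)
col_classes <- sapply(grid, class)   # class of each column

# Named vector standing in for one sample column (sample1 from the question).
sampleData <- c(up1 = 1, up2 = 2, up3 = 3, up4 = 4,
                down1 = 5, down2 = 1, down3 = 2)

denom <- grid[1, 2]                            # the factor value "down1"
by_factor <- sampleData[denom]                 # indexes by integer code 1, so picks up1
by_name   <- sampleData[as.character(denom)]   # indexes by label, so picks down1

print(col_classes)
print(by_factor)
print(by_name)
```

Because "down1" is the first level of its factor, its integer code is 1, and position 1 of sampleData is up1. That is why every denominator in the question was silently replaced by one of the "up" values.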
If you create a data.frame of characters instead, by changing the line that builds singleRatio to the following, everything should work:
singleRatio<-as.data.frame(expand.grid(up,down,stringsAsFactors=FALSE))
Result :
> ratios.df
            sample1   sample2   sample3   sample4
up1_down1 0.1666667 0.7142857 0.4000000 0.8333333
up2_down1 0.2857143 0.6666667 0.5000000 0.8333333
up3_down1 0.3750000 0.7142857 0.6666667 0.8333333
up4_down1 0.4444444 0.7777778 0.6250000 0.8888889
up1_down2 0.5000000 0.7142857 0.5000000 0.7142857
up2_down2 0.6666667 0.6666667 0.6000000 0.7142857
up3_down2 0.7500000 0.7142857 0.7500000 0.7142857
up4_down2 0.8000000 0.7777778 0.7142857 0.8000000
up1_down3 0.3333333 0.6250000 0.4000000 0.6250000
up2_down3 0.5000000 0.5714286 0.5000000 0.6250000
up3_down3 0.6000000 0.6250000 0.6666667 0.6250000
up4_down3 0.6666667 0.7000000 0.6250000 0.7272727
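As an aside, the same table can probably be built without the loops once the pair names are plain characters. The following is only a sketch under that assumption, not the approach used in the question: num, denom, ratios and flags are illustrative names, and the thresholds vector is the alternating 0.1/0.5 vector from the question written with rep().

```r
up   <- c("up1", "up2", "up3", "up4")
down <- c("down1", "down2", "down3")

# Character columns, so row lookups below go by name rather than by factor code.
singleRatio <- expand.grid(numerator = up, denominator = down,
                           stringsAsFactors = FALSE)
rownames(singleRatio) <- paste(singleRatio$numerator,
                               singleRatio$denominator, sep = "_")

data <- data.frame(
  sample1 = c(1, 2, 3, 4, 5, 1, 2),
  sample2 = c(5, 4, 5, 7, 2, 2, 3),
  sample3 = c(2, 3, 6, 5, 3, 2, 3),
  sample4 = c(5, 5, 5, 8, 1, 2, 3),
  row.names = c(up, down)
)

# One row per pair: numerator values and denominator values for every sample.
num   <- as.matrix(data[singleRatio$numerator, ])
denom <- as.matrix(data[singleRatio$denominator, ])

# Element-wise ratio for all pairs and samples at once.
ratios <- num / (num + denom)
rownames(ratios) <- rownames(singleRatio)
ratios.df <- as.data.frame(ratios)

# One threshold per pair; it is recycled down each sample column.
thresholds <- rep(c(0.1, 0.5), times = 6)
flags <- ifelse(ratios > thresholds, 1, 0)

print(ratios.df)
print(flags)
```

The threshold comparison relies on thresholds having exactly one entry per row of ratios, so that column-wise recycling lines each threshold up with its pair.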
Upvotes: 1