Ben

Reputation: 451

Vector order when passing back and forth to a function

Say I have a data frame whose row names are the concatenated names of a numerator and a denominator. The two columns of this data frame are the numerator and the denominator.

up<-c("up1","up2","up3","up4")
down<-c("down1","down2","down3")

singleRatio<-as.data.frame(expand.grid(up,down))
rownames(singleRatio)<-paste(singleRatio$Var1,singleRatio$Var2,sep="_")
colnames(singleRatio)<-c("numerator","denominator")

Each numerator and denominator has corresponding entries in a dataframe with the num/denom as rows and samples as columns.

sample1<-c(1,2,3,4,5,1,2)
sample2<-c(5,4,5,7,2,2,3)
sample3<-c(2,3,6,5,3,2,3)
sample4<-c(5,5,5,8,1,2,3)
data<-data.frame(sample1,sample2,sample3,sample4)
rownames(data)<-c(up,down)

I want to create a dataframe full of a test result where I calculate all of the ratios and compare them to a threshold (1 if it's over a threshold, 0 if it's under). This creates the ratios.df:

ratios.df<-data.frame(matrix(nrow = length(rownames(singleRatio)),ncol = length(colnames(data)) ))
rownames(ratios.df)<-rownames(singleRatio)
colnames(ratios.df)<-colnames(data)
ratios.df

I've got a function called getRatio to find all the ratios for each sample:

getRatio<-function(sampleData){
  sampleRatios<-rep(0,each=length(rownames(singleRatio)))
  names(sampleRatios)<-rownames(singleRatio)
  for( ratio in rownames(singleRatio)){ 
    sampleRatios[ratio]<-sampleData[singleRatio[ratio,1]]/(sampleData[singleRatio[ratio,1]] + sampleData[singleRatio[ratio,2]])
  }
  return(sampleRatios)
}

And this is my attempt to bring everything together.

thresholds<-c(0.1,0.5,0.1,0.5,0.1,0.5,0.1,0.5,0.1,0.5,0.1,0.5)
for (sampleName in colnames(data)){
  dataline<-data[,sampleName]
  names(dataline)<-rownames(data)
  sampleRatios<-getRatio(dataline)
  ratios.df[,sampleName]<-sampleRatios
  #ratios.df[,sampleName]<-ifelse(sampleRatios > thresholds,1,0)
}

The problem is that when I look at the resulting ratios, nothing matches. ratios.df ends up being:

> ratios.df
            sample1   sample2   sample3   sample4
up1_down1 0.5000000 0.5000000 0.5000000 0.5000000
up2_down1 0.6666667 0.4444444 0.6000000 0.5000000
up3_down1 0.7500000 0.5000000 0.7500000 0.5000000
up4_down1 0.8000000 0.5833333 0.7142857 0.6153846
up1_down2 0.3333333 0.5555556 0.4000000 0.5000000
up2_down2 0.5000000 0.5000000 0.5000000 0.5000000
up3_down2 0.6000000 0.5555556 0.6666667 0.5000000
up4_down2 0.6666667 0.6363636 0.6250000 0.6153846
up1_down3 0.2500000 0.5000000 0.2500000 0.5000000
up2_down3 0.4000000 0.4444444 0.3333333 0.5000000
up3_down3 0.5000000 0.5000000 0.5000000 0.5000000
up4_down3 0.5714286 0.5833333 0.4545455 0.6153846

And the original data is

> data
      sample1 sample2 sample3 sample4
up1         1       5       2       5
up2         2       4       3       5
up3         3       5       6       5
up4         4       7       5       8
down1       5       2       3       1
down2       1       2       2       2
down3       2       3       3       3

This means that the ratio for up1_down1 for sample1 should be 1/(1+5)=0.17, not 0.50. Long story short, I have no idea why or even where things are getting swapped around in here. Is anyone able to see what I'm doing wrong?

Upvotes: 0

Views: 37

Answers (1)

digEmAll

Reputation: 57210

The problem is that the singleRatio data.frame contains two columns of factors, not characters. When you make a selection such as sampleData[singleRatio[ratio,1]], the factor is coerced to its integer codes instead of its string representation, so the wrong value is selected. For example, "down1" is level 1 of the denominator column, so it selects sampleData[1], which is the value for up1, and up1_down1 becomes 1/(1+1)=0.5.
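A minimal sketch of that behaviour on toy data (the names `values`, `f`, `by_factor` and `by_name` are made up for illustration and are not from the question): indexing a named vector with a factor uses the factor's integer codes, while converting to character first uses the labels.

```r
# Hypothetical toy data: a named numeric vector and a factor of its names.
values <- c(a = 10, b = 20, c = 30)

# Levels are given explicitly so that "c" has integer code 1 and "b" has code 2.
f <- factor(c("c", "b"), levels = c("c", "b"))

# Indexing by the factor uses its integer codes (1, 2),
# so this picks the first and second elements: a and b.
by_factor <- values[f]

# Converting to character first indexes by label, picking c and b as intended.
by_name <- values[as.character(f)]

print(by_factor)
print(by_name)
```

Wrapping the index in as.character() inside getRatio would be another way to get the intended lookup, if changing how singleRatio is built is not an option.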

If you create a data.frame of characters instead, by changing the line that builds singleRatio to the following, everything should work:

singleRatio<-as.data.frame(expand.grid(up,down,stringsAsFactors=FALSE))

Result :

> ratios.df
            sample1   sample2   sample3   sample4
up1_down1 0.1666667 0.7142857 0.4000000 0.8333333
up2_down1 0.2857143 0.6666667 0.5000000 0.8333333
up3_down1 0.3750000 0.7142857 0.6666667 0.8333333
up4_down1 0.4444444 0.7777778 0.6250000 0.8888889
up1_down2 0.5000000 0.7142857 0.5000000 0.7142857
up2_down2 0.6666667 0.6666667 0.6000000 0.7142857
up3_down2 0.7500000 0.7142857 0.7500000 0.7142857
up4_down2 0.8000000 0.7777778 0.7142857 0.8000000
up1_down3 0.3333333 0.6250000 0.4000000 0.6250000
up2_down3 0.5000000 0.5714286 0.5000000 0.6250000
up3_down3 0.6000000 0.6250000 0.6666667 0.6250000
up4_down3 0.6666667 0.7000000 0.6250000 0.7272727
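As a side note, the same table can probably be built without the loop once the names are plain characters. The following is only a sketch of an alternative, not the approach from the question; the names `combos`, `num`, `den`, `ratios` and `flags` are made up here, and the thresholds are assumed to apply per row, in the order given in the question.

```r
# Rebuild the question's inputs so the example is self-contained.
up   <- c("up1", "up2", "up3", "up4")
down <- c("down1", "down2", "down3")
data <- data.frame(sample1 = c(1, 2, 3, 4, 5, 1, 2),
                   sample2 = c(5, 4, 5, 7, 2, 2, 3),
                   sample3 = c(2, 3, 6, 5, 3, 2, 3),
                   sample4 = c(5, 5, 5, 8, 1, 2, 3))
rownames(data) <- c(up, down)

# All numerator/denominator pairs as character columns (no factors).
combos <- expand.grid(numerator = up, denominator = down,
                      stringsAsFactors = FALSE)

# Look up whole rows by name; each result is a 12 x 4 numeric matrix.
num <- as.matrix(data[combos$numerator, ])
den <- as.matrix(data[combos$denominator, ])

# Element-wise ratio for every pair and every sample at once.
ratios <- num / (num + den)
rownames(ratios) <- paste(combos$numerator, combos$denominator, sep = "_")

# Same 12 thresholds as in the question; the vector is recycled down each
# column, so row i of every sample is compared with thresholds[i].
thresholds <- rep(c(0.1, 0.5), times = 6)
flags <- (ratios > thresholds) * 1  # 1 if over the threshold, 0 otherwise

print(ratios)
print(flags)
```

The result is a matrix rather than a data.frame; as.data.frame(ratios) would convert it if a data.frame is needed.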

Upvotes: 1
