Reputation: 201
I have 2 data.frames
> head(cont)
                    old_pert     cmap_name       conc perturb_geo        t1        t2        t3        t4        t5
1 5202764005789148112904.A02     estradiol 0.00000001   GSM119257 GSM119218 GSM119219 GSM119221 GSM119222 GSM119223
2 5202764005789148112904.A01 valproic acid 0.00050000   GSM119256 GSM119218 GSM119219 GSM119221 GSM119222 GSM119223
> head(expression)[1:3,1:8]
          GSM118911 GSM118912 GSM118913 GSM118723 GSM118724 GSM118725 GSM118726 GSM118727
1007_s_at     387.6     393.2     290.5     378.6     507.8     383.7     288.8     451.9
1053_at        56.4      53.5      32.8      39.0      71.5      47.3      46.0      50.1
117_at          6.3      33.6      19.2      17.6      20.3      15.0       7.1      43.1
I want to apply a loop to do:
for(i in 1:nrow(cont)){
  # first take some values from cont, which will be used ahead
  vehicle <- cont[i, 5:9]
  perturb <- cont[i, 4]
  col_name <- paste(cont[i, 2], cont[i, 3], sep = '_') #estradiol_.00001
  tmp <- sum(expression[,which(colnames(expression) == vehicle)])/5
  tmp2 <- expression[,which(colnames(expression) == perturb)]
  tmp3 <- tmp/tmp2
  div <- cbind(div, tmp3)
  colnames(div)[i + 1] <- col_name
}
Take those columns from expression whose column names match vehicle and perturb, and apply division:
div <- expression$vehicle / expression$perturb # I don't see how to pass the values stored in `vehicle` and `perturb` here
Assign this new variable a column name that combines the drug name and the concentration:
col.names(div) <- drug_name_concentration
Assign it the row names of expression:
row.names(div) <- row.names(expression)
This process will iterate 271 times (nrow(cont) = 271), and each time the newly divided column will be bound with cbind to the previous div. Hence the final outcome will be:
          arachidonic acid_0.000010 oligomycin_0.000001 .........
1007_s_at                      0.45                0.30
1053_at                        1.34                0.65
117_at                         0.11                0.67
.....
.....
The logic is clear in my head but I am not getting how I can do it. Thanks for your help.
Upvotes: 0
Views: 1374
Reputation: 1360
You are not assigning the variables correctly in the loop. Below is a sample loop that goes over each row and assigns the variables correctly (for example, on the first iteration i == 1). Note that I have changed how the column name is generated.
for(i in 1:nrow(cont)){
  vehicle <- cont[i, 3]
  perturb <- cont[i, 4]
  col_name <- paste(cont[i, 5], cont[i, 6], sep = '_')
}
To then look up the columns named by these variables, you can use the
df[,which(colnames(df) == x)]
approach, where df is your data frame and x is the variable holding the column name.
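As a minimal sketch of that lookup on toy data (the data frame and its column names below are made up for illustration and are not from the question), the same column can also be fetched directly by the name stored in the variable:

```r
# Toy data frame; the column names are hypothetical stand-ins for GSM ids.
df <- data.frame(GSM_A = c(1, 2, 3), GSM_B = c(10, 20, 30))
x <- "GSM_B"  # column name held in a variable

by_which <- df[, which(colnames(df) == x)]  # the lookup described above
by_name  <- df[[x]]                         # direct lookup by name

identical(by_which, by_name)  # both return the GSM_B column as a plain vector
```

Either form works when x holds exactly one column name; the which() form is kept in the loops here to stay close to the question's code.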
Therefore,
div <- data.frame(row.names(expression))
for(i in 1:nrow(cont)){
  vehicle <- cont[i, 3]
  perturb <- cont[i, 4]
  col_name <- paste(cont[i, 5], cont[i, 6], sep = '_')
  tmp <- expression[,which(colnames(expression) == vehicle)]/
    expression[,which(colnames(expression) == perturb)]
  div <- cbind(div, tmp)
  colnames(div)[i + 1] <- col_name
}
div <- div[,-1]
row.names(div) <- row.names(expression)
The loop goes through each row of cont, assigns the values to the variables, finds the matching columns in expression, and divides one resulting vector by the other.
It then binds the result as a new column to the div data frame, which was created before the loop from the row names of expression, and sets that column's name.
After the loop completes, the first column, which is now redundant, is dropped and the row names are set from expression.
EDIT - question changed
change #1
vehicle <- cont[i, 5:9]
to
vehicle <- cont[i, c(5:9)] ## note c(); this selects the same columns as 5:9
change #2
tmp <- sum(expression[,which(colnames(expression) == vehicle)])/5
to
tmp <- sum(expression[,which(colnames(expression) %in% vehicle)])/5
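To illustrate why change #2 matters, here is a small sketch with invented column names (not the question's data). With several names on the right-hand side, == compares position by position, while %in% tests membership:

```r
cols    <- c("GSM1", "GSM2", "GSM3", "GSM4")  # hypothetical column names
vehicle <- c("GSM2", "GSM4")                  # names to look up

# `==` compares element by element, recycling `vehicle` along `cols`,
# so a match is only found where the positions happen to line up.
eq_hits <- which(cols == vehicle)

# `%in%` asks, for each column name, whether it occurs anywhere in `vehicle`.
in_hits <- which(cols %in% vehicle)

eq_hits  # 4
in_hits  # 2 4
```

Here == misses GSM2 because it is compared against the recycled GSM4 at that position, whereas %in% finds both columns.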
FINAL EDIT
Full working loop:
div <- data.frame(row.names(expression))  # placeholder first column, dropped after the loop
for(i in 1:nrow(cont)){
  perturb <- cont[i, 4]
  col_name <- paste(cont[i, 2], cont[i, 3], sep = '_')
  vehicle <- cont[i, c(5:9)]
  vehicle <- unname(unlist(vehicle[1,]))  # plain vector of the five vehicle column names
  tmp <- expression[,which(colnames(expression) %in% vehicle)]
  row_tots <- as.data.frame(rowSums(tmp))
  row_tots <- row_tots/5  # per-row mean of the five vehicle columns
  tmp <- row_tots/expression[,which(colnames(expression) == perturb)]
  div <- cbind(div, tmp)
  colnames(div)[i + 1] <- col_name
}
div <- div[,-1]
row.names(div) <- row.names(expression)
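The loop above computes, for each row of cont, the row-wise mean of the vehicle columns divided by the perturbation column. As a hedged alternative sketch (not the code above, and run on invented toy data with only two vehicle columns), the same calculation can be written with sapply and rowMeans, which avoids growing div with cbind on every iteration. All column names and values below are assumptions for illustration:

```r
# Toy stand-ins for the question's data; all names and values are invented.
expression <- data.frame(
  V1 = c(2, 4), V2 = c(4, 8), P1 = c(1, 2), P2 = c(3, 3),
  row.names = c("probe_a", "probe_b")
)
cont <- data.frame(
  cmap_name   = c("drugA", "drugB"),
  conc        = c("0.1", "0.2"),   # kept as text so the labels are predictable
  perturb_geo = c("P1", "P2"),
  t1 = c("V1", "V1"), t2 = c("V2", "V2"),
  stringsAsFactors = FALSE
)
vehicle_cols <- c("t1", "t2")      # columns of `cont` holding vehicle sample names

# One column of ratios per row of `cont`:
# mean of the vehicle columns divided by the perturbation column.
ratios <- sapply(seq_len(nrow(cont)), function(i) {
  vehicle <- unlist(cont[i, vehicle_cols], use.names = FALSE)
  rowMeans(expression[, vehicle, drop = FALSE]) / expression[[cont$perturb_geo[i]]]
})

div <- as.data.frame(ratios, row.names = rownames(expression))
colnames(div) <- paste(cont$cmap_name, cont$conc, sep = "_")
div
```

For the real data, vehicle_cols would be c("t1", "t2", "t3", "t4", "t5"), and rowMeans then divides by five automatically.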
Upvotes: 1