Reputation: 67
I have a CSV file containing experimental values for many samples, some of which were measured in replicate. For replicated samples I only want to use the mean value of the replicates belonging to the same sample. The problem is that the number of replicates varies: it can be 2, 3, 4, and so on.
My code can't be right in general, because it could only work when the number of replicates is exactly 2 (I use a loop to compare each sample ID to the previous sample ID). On top of that, it doesn't work at all: it assigns the same average value to all my samples. I think there is also a problem at the start of the loop, because when x = 1, x - 1 = 0, which doesn't correspond to any value, so that may be what breaks the code. I am a beginner in R and have never had any courses or training; I am trying to learn it by myself, so thank you in advance for your help.
My dataset:
Expected output:
PS: in this example the number of replicates is 2. However, it can differ between samples: sometimes it's 2, sometimes 3, 4, etc.
for (x in length(dat$Sample)){
  if (dat$Sample[x]==dat$Sample[x-1]){
    dat$Average.OD[x-1] <- mean(dat$OD[x], dat$OD[x-1])
    dat$Average.OD[x] <- NA
  }
}
Upvotes: 0
Views: 875
Reputation: 141
Here is a possible solution using data.table.
#Data
data <- data.frame('Sample'=c('Blank','Blank','STD1','STD1'),
'OD'=c(0.07,0.08,0.09,0.10))
#Code
#Loading data.table and converting our data to a data.table (by reference).
library(data.table)
setDT(data)
#Finding the average of OD by the Sample column. Here the Sample column is the grouping key. If you want to group by both Sample and Replicates, pass both of them in by, and so on.
data[, AverageOD := mean(OD), by = c("Sample")]
#Turning all the duplicate AverageOD values to NA.
data[duplicated(data, by = c("Sample")), AverageOD := NA]
#Renaming the AverageOD column to 'Average OD' (setnames renames by reference).
setnames(data, "AverageOD", "Average OD")
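The same idea also works when the number of replicates differs between samples, because the mean is computed per group rather than per pair of rows. As a minimal sketch in base R (swapping in ave() for the data.table grouping so it runs without extra packages; the data values and the Average.OD column name are made up for illustration):

```r
# Hypothetical data: 2 replicates for Blank, 3 for STD1, 1 for STD2
dat <- data.frame(Sample = c("Blank", "Blank", "STD1", "STD1", "STD1", "STD2"),
                  OD     = c(0.07, 0.08, 0.09, 0.10, 0.11, 0.20))

# ave() returns the group mean repeated on every row of that group
dat$Average.OD <- ave(dat$OD, dat$Sample, FUN = mean)

# Keep the mean only on the first row of each sample, NA on the other replicates
dat$Average.OD[duplicated(dat$Sample)] <- NA

print(dat)
```

Here duplicated() marks every row after the first occurrence of a sample, which is what lets the approach ignore how many replicates each sample has.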
Let me know if you have any questions.
Upvotes: 1
Reputation: 37641
You can do this without any looping using aggregate and merge. Since you do not provide any data, I illustrate with a simple example.
## Example data
set.seed(123)
Sample = round(runif(10), 1)
OD = sample(4, 10, replace=TRUE)
dat = data.frame(OD, Sample)
Means = aggregate(dat$Sample, list(dat$OD), mean, na.rm=TRUE)
names(Means) = c("OD", "mean")
Means
OD mean
1 1 0.9000000
2 2 0.7000000
3 3 0.3666667
4 4 0.4000000
merge(dat, Means, "OD")
OD Sample mean
1 1 0.9 0.9000000
2 1 0.9 0.9000000
3 2 0.8 0.7000000
4 2 0.9 0.7000000
5 2 0.4 0.7000000
6 3 0.0 0.3666667
7 3 0.6 0.3666667
8 3 0.5 0.3666667
9 4 0.3 0.4000000
10 4 0.5 0.4000000
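Applied to column names like those in the question, the aggregate and merge approach could look like the sketch below. The data values and the Average.OD name are assumptions for illustration, and the last step is only needed if the mean should appear once per sample instead of on every replicate row.

```r
# Hypothetical data with an unequal number of replicates per sample
dat <- data.frame(Sample = c("Blank", "Blank", "STD1", "STD1", "STD1"),
                  OD     = c(0.07, 0.08, 0.09, 0.10, 0.11))

# One mean per sample; the formula interface keeps the column names
Means <- aggregate(OD ~ Sample, data = dat, FUN = mean)
names(Means)[names(Means) == "OD"] <- "Average.OD"

# Attach the per-sample mean to every replicate row
res <- merge(dat, Means, by = "Sample")

# Optional: keep the mean on the first replicate of each sample only
res$Average.OD[duplicated(res$Sample)] <- NA

print(res)
```

Because aggregate groups by value, it does not matter whether a sample has 2, 3, or 4 replicates, or whether the replicates sit on adjacent rows.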
Upvotes: 1