Reputation: 9018
Situation:
Here is the data I have:
> head(data1)
CHROM POS REF ALT DIFF GT
1 chr01 14653 C T 254 CT
2 chr01 14907 A G 254 AG
3 chr01 14930 A G 23 AG
4 chr01 15190 G A 260 GA
5 chr01 15211 T G 21 TG
6 chr01 16378 T C 1167 TC
> tail(data1)
154176 chrX 154901366 T A 58700 TA
154177 chrX 154901404 A T 38 AT
154178 chrX 154933406 A G 32002 AG
154179 chrX 154933419 A T 13 AT
154180 chrX 154933451 T C 32 TC
154181 chrX 154933473 G T 22 GT
What I want to do:
The code I have now only computes the mean DIFF value per POS group; it does not also group by CHROM.
Code:
library(plyr)

# Assumes data1 already has a POSgroup column (e.g. floor(POS / 1e7))
datsum <- ddply(data1, .variables = "POSgroup", .fun = function(x) {
  # Calculate the mean DIFF value for each GT group in this POSgroup
  meandiff <- ddply(x, .variables = "GT", .fun = summarise, ymean = mean(DIFF))
  # Add the center of the POSgroup range as the x position
  meandiff$center <- (x$POSgroup[1] * 1e7) + 0.5e7
  # Return the results
  meandiff
})
Can anyone help me with this?
Upvotes: 1
Views: 322
Reputation: 49448
Using data.table, this will give you a starting point:
library(data.table)
dt = data.table(data1)
# CHROM is a label such as "chr01", so group by it directly; only POS is binned
dt[, mean(DIFF), by = list(CHROM, floor(POS/1e7))]
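To get closer to the output of the plyr code in the question, the same grouping can be extended to include GT and a bin center. The following is only a sketch: the toy table, the `bin_size` variable, and the `POSgroup`/`center` column names are assumptions chosen for illustration, and the 1e7 bin width is taken from the question's code.

```r
library(data.table)

# Toy data in the same shape as data1 (values made up for illustration)
toy <- data.table(
  CHROM = c("chr01", "chr01", "chr01", "chr02", "chr02"),
  POS   = c(14653, 14907, 12000000, 15190, 15211),
  DIFF  = c(254, 254, 100, 260, 20),
  GT    = c("CT", "AG", "CT", "GA", "GA")
)

bin_size <- 1e7  # assumed bin width, matching the 1e7 used in the question

# Mean DIFF per chromosome, per position bin, per genotype
res <- toy[, .(ymean = mean(DIFF)),
           by = .(CHROM, POSgroup = floor(POS / bin_size), GT)]

# Center of each position bin, usable as an x position when plotting
res[, center := POSgroup * bin_size + bin_size / 2]
print(res)
```

Replacing `toy` with `data.table(data1)` should give one row per CHROM, position bin, and GT combination that occurs in the data.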
Upvotes: 3