Reputation: 2584
I have data like this:
df<- structure(list(X1 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), X2 = structure(c(1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L,
18L, 19L, 20L, 21L, 22L, 23L, 24L, 7L, 8L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L,
19L, 20L, 21L, 22L, 23L, 24L), .Label = c("B02", "B03", "B04",
"B05", "B06", "B07", "C02", "C03", "C04", "C05", "C06", "C07",
"D02", "D03", "D04", "D05", "D06", "D07", "G02", "G03", "G04",
"G05", "G06", "G07"), class = "factor"), X3 = c(0.005648642,
0.005876389, 0.00592532, 0.006244456, 0.005987075, 0.006075874,
0.006198667, 0.006003758, 0.006041885, 0.006186987, 0.006041323,
0.006071594, 0.005902391, 0.005976096, 0.00593805, 0.005866524,
0.0059831, 0.005902586, 0.005914309, 0.005887304, 0.006054509,
0.005931266, 0.005936195, 0.005895191, 0.005840959, 0.005849247,
0.005808851, 0.005833586, 0.005825153, 0.00584873, 0.005983976,
0.00598669, 0.006011548, 0.005997747, 0.005851022, 0.005919044,
0.005854566, 0.0058226, 0.00578052, 0.005784874, 0.005933198,
0.005996407, 0.005898848, 0.00595775, 0.005918857, 0.005882898,
0.005877808, 0.005803604, 0.006235161, 0.005808725)), .Names = c("X1",
"X2", "X3"), class = "data.frame", row.names = c(NA, -50L))
I am trying to compute the average of a few specific values, subtract it from every value in the data, and then average specific rows of the corrected values.
Here is what I did.
First I tried to get the average of "G05", "G06" and "G07" within each set (X1), so that I can then subtract it from each value:
library(dplyr)
# mean of X3 over G05, G06 and G07, one value per group in X1
df2 <- df %>%
  filter(X2 %in% paste0("G0", 5:7)) %>%
  group_by(X1) %>%
  summarise(X3 = mean(X3))
This should give me two numbers, one for group 1 and one for group 2 (based on X1):
mean(c(0.005931266, 0.005936195, 0.005895191))
# [1] 0.005920884
mean(c(0.005803604, 0.006235161, 0.005808725))
# [1] 0.005949163
Then I want to subtract this value from each number in group 1 and group 2, using the mean of the matching group,
for example 0.005648642 - 0.005920884, ..., 0.005840959 - 0.005949163
In simple words:
1- Get the mean of G05, G06 and G07 within each group (X1 is 1 or 2),
for example
mean(c(0.005931266, 0.005936195, 0.005895191))
# [1] 0.005920884
mean(c(0.005803604, 0.006235161, 0.005808725))
# [1] 0.005949163
2- Subtract these means from every single number in the corresponding group, for example
0.005648642 - 0.005920884
...
0.005840959 - 0.005949163
3- After this correction, I want to take the average of specific rows, separately for each group.
For example, B02 and B03 in each group:
mean(c(0.005648642 - 0.005920884, 0.005876389 - 0.005920884))
and
mean(c(0.005808851 - 0.005949163, 0.005833586 - 0.005949163))
Reputation: 50678
Is this what you're after?
Steps 1 and 2:
Split on X1 (i.e. group by X1) and center the values in X3 based on the mean across G05, G06, G07:
lst <- lapply(split(df, df$X1), function(w) {
w.G0567 <- subset(w, grepl("G0[567]", w$X2));
print(mean(w.G0567$X3));
w$X3 <- w$X3 - mean(w.G0567$X3);
return(w);
})
#[1] 0.005920884
#[1] 0.005949163
lst;
#$`1`
# X1 X2 X3
#1 1 B02 -0.000272242
#2 1 B03 -0.000044495
#3 1 B04 0.000004436
#4 1 B05 0.000323572
#5 1 B06 0.000066191
#6 1 B07 0.000154990
#7 1 C02 0.000277783
#8 1 C03 0.000082874
#9 1 C04 0.000121001
#10 1 C05 0.000266103
#11 1 C06 0.000120439
#12 1 C07 0.000150710
#13 1 D02 -0.000018493
#14 1 D03 0.000055212
#15 1 D04 0.000017166
#16 1 D05 -0.000054360
#17 1 D06 0.000062216
#18 1 D07 -0.000018298
#19 1 G02 -0.000006575
#20 1 G03 -0.000033580
#21 1 G04 0.000133625
#22 1 G05 0.000010382
#23 1 G06 0.000015311
#24 1 G07 -0.000025693
#
#$`2`
# X1 X2 X3
#25 2 C02 -1.082043e-04
#26 2 C03 -9.991633e-05
#27 2 B02 -1.403123e-04
#28 2 B03 -1.155773e-04
#29 2 B04 -1.240103e-04
#30 2 B05 -1.004333e-04
#31 2 B06 3.481267e-05
#32 2 B07 3.752667e-05
#33 2 C02 6.238467e-05
#34 2 C03 4.858367e-05
#35 2 C04 -9.814133e-05
#36 2 C05 -3.011933e-05
#37 2 C06 -9.459733e-05
#38 2 C07 -1.265633e-04
#39 2 D02 -1.686433e-04
#40 2 D03 -1.642893e-04
#41 2 D04 -1.596533e-05
#42 2 D05 4.724367e-05
#43 2 D06 -5.031533e-05
#44 2 D07 8.586667e-06
#45 2 G02 -3.030633e-05
#46 2 G03 -6.626533e-05
#47 2 G04 -7.135533e-05
#48 2 G05 -1.455593e-04
#49 2 G06 2.859977e-04
#50 2 G07 -1.404383e-04
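Since the question starts from dplyr, the same centring can also be sketched as a grouped mutate. This is only an illustration under a few assumptions: `toy` is a small subset of the question's data (B02, B03 and G05 to G07 for both groups) used to keep the example short, and the names `ref_wells` and `X3_centred` are made up for the example.

```r
library(dplyr)

# Subset of the question's data: B02, B03, G05, G06, G07 for X1 = 1 and X1 = 2
toy <- data.frame(
  X1 = rep(1:2, each = 5),
  X2 = factor(rep(c("B02", "B03", "G05", "G06", "G07"), times = 2)),
  X3 = c(0.005648642, 0.005876389, 0.005931266, 0.005936195, 0.005895191,
         0.005808851, 0.005833586, 0.005803604, 0.006235161, 0.005808725)
)

# Hypothetical name for the rows whose mean is used as the reference
ref_wells <- c("G05", "G06", "G07")

# Within each X1 group, subtract the mean of the reference rows from every X3
centred <- toy %>%
  group_by(X1) %>%
  mutate(X3_centred = X3 - mean(X3[X2 %in% ref_wells])) %>%
  ungroup()

centred
```

Inside a grouped `mutate()`, `X3` and `X2` refer to the rows of the current group only, so the reference mean is computed separately for each value of X1. The result stays a single data frame instead of a list of data frames.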
Step 3:
For every group, average the centred X3 values for B02 and B03.
lapply(lst, function(w) mean(subset(w, X2 %in% c("B02", "B03"))$X3))
#$`1`
#[1] -0.0001583685
#
#$`2`
#[1] -0.0001279448
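All three steps can also be chained in one dplyr pipeline. The sketch below assumes the same small subset of the question's data as a stand-in for the full `df`; the names `toy`, `res` and `mean_B02_B03` are hypothetical and only serve the example.

```r
library(dplyr)

# Subset of the question's data: B02, B03, G05, G06, G07 for X1 = 1 and X1 = 2
toy <- data.frame(
  X1 = rep(1:2, each = 5),
  X2 = factor(rep(c("B02", "B03", "G05", "G06", "G07"), times = 2)),
  X3 = c(0.005648642, 0.005876389, 0.005931266, 0.005936195, 0.005895191,
         0.005808851, 0.005833586, 0.005803604, 0.006235161, 0.005808725)
)

res <- toy %>%
  group_by(X1) %>%
  # steps 1 and 2: centre X3 on the per-group mean of G05, G06, G07
  mutate(X3 = X3 - mean(X3[X2 %in% c("G05", "G06", "G07")])) %>%
  # step 3: keep B02 and B03 and average the centred values per group
  filter(X2 %in% c("B02", "B03")) %>%
  summarise(mean_B02_B03 = mean(X3))

res
```

For group 1 this should agree with the hand calculation in the question, mean(c(0.005648642 - 0.005920884, 0.005876389 - 0.005920884)), which is about -0.000158. The filter has to come after the mutate, otherwise the G05 to G07 rows are gone before the reference mean is computed.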