Reputation: 2584
I have a data like this
df<- structure(list(X1 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), X2 = structure(c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
12L, 13L, 14L, 15L, 16L, 17L, 18L), .Label = c("B02", "B03",
"B04", "B05", "B06", "B07", "C02", "C03", "C04", "C05", "C06",
"C07", "D02", "D03", "D04", "D05", "D06", "D07"), class = "factor"),
X3 = c(1146L, 1246L, 1624L, 650L, 874L, 1267L, 1951L, 1214L,
1275L, 1398L, 1199L, 1268L, 849L, 1268L, 1212L, 835L, 1246L,
1188L, 1172L, 1164L, 1251L, 845L, 1167L, 1113L, 818L, 1125L,
1130L, 1186L, 1221L, 1176L, 1223L, 877L, 1253L, 1253L, 1253L,
1253L)), .Names = c("X1", "X2", "X3"), class = "data.frame", row.names = c(NA,
-36L))
I am trying to get average of specific rows
B02 to B05 when the X1 is 1 (called DF1)
1 B02 1146
1 B03 1246
1 B04 1624
1 B05 650
B02 to B05 when the X1 is 2 (called DF2)
2 B02 1172
2 B03 1164
2 B04 1251
2 B05 845
Then take the average of C02 and D02 when the X1 is 1 (called DF3)
1 C02 1951
1 D02 849
Then take average of C02 and D02 when the X1 is 2 (called DF4)
2 C02 818
2 D02 1223
and substrate the average values DF1-DF3 and DF2-DF4
Upvotes: 0
Views: 88
Reputation: 4534
If I'm understanding correctly, I think the process can be slightly simplified as the same process is being repeated for each group in X1
. Building on the existing answer by @BWilliams, we could do:
library(dplyr)
library(tidyr)
d12 <- c("B02", "B03", "B04", "B05")
d34 <- c("C02", "D02")
result <- df %>%
mutate(DF = case_when(X2 %in% d12 ~ 'DF12',
X2 %in% d34 ~ "DF34")) %>%
drop_na %>%
group_by(X1, DF) %>%
summarise(means = mean(X3), SD = sd(X3))
result
#> # A tibble: 4 x 4
#> # Groups: X1 [?]
#> X1 DF means SD
#> <int> <chr> <dbl> <dbl>
#> 1 1 DF12 1166.5 401.1729
#> 2 1 DF34 1400.0 779.2317
#> 3 2 DF12 1108.0 179.6756
#> 4 2 DF34 1020.5 286.3782
The final mean difference can then be obtained by subtracting the last
mean from the first
result %>% summarise(mean_diff = first(means) - last(means))
#> # A tibble: 2 x 2
#> X1 mean_diff
#> <int> <dbl>
#> 1 1 -233.5
#> 2 2 87.5
Upvotes: 1
Reputation: 2050
Here is a tidyverse
version
library(tidyverse)
Create "lookup tables"
d1 <- c("B02", "B03", "B04", "B05")
d3 <- c("C02", "D02")
Use case_when
to designate the rows
df %>%
mutate(DF = case_when(X1==1 & X2 %in% d1 ~ 'DF1',
X1==2 & X2 %in% d1 ~ 'DF2',
X1==1 & X2 %in% d3 ~ "DF3",
X1==2 & X2 %in% d3 ~ "DF4")) %>%
drop_na %>%
group_by(DF) %>%
summarise(means = mean(X3)) %>%
summarise(DF13 = means[3]-means[1],
DF24 = means[2]-means[4])
You could get the SD by using mutate instead of the last summarise
summarise(means = mean(X3), sd = sd(X3)) %>%
mutate(DF13 = means[3]-means[1],
DF24 = means[2]-means[4],
SD = sd)
Upvotes: 1