Reputation: 1704
I have a data frame, and for each year I want the mean of the type b values in the columns where type a equals 1.
Year  type  value1  value2  value3  value4  value5
1     a     1       1       2       3       4
1     b     10      12      9       8       10
2     a     1       2       2       2       1
2     b     11      10      13      9       14
so that my final product looks like this:
Year  type_b_values
1     11
2     12.5
which are the average of value1 and value2 for Year 1, and the average of value1 and value5 for Year 2. Thanks!
Upvotes: 1
Views: 296
Reputation: 89057
And here is a version using plyr:
library(plyr)
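# dat is the data frame from the question
ddply(dat, "Year", function(x) {
  # names of the value columns (value1 ... value5)
  values.cols <- grep("value", names(x), value = TRUE)
  a <- subset(x, type == "a", values.cols)
  b <- subset(x, type == "b", values.cols)
  # mean of the b values in the columns where a equals 1
  c("type_b_values" = mean(b[a == 1]))
})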
# Year type_b_values
# 1 1 11.0
# 2 2 12.5
Upvotes: 1
Reputation: 109844
Here is an approach using base functions. I'm guessing plyr or reshape may be useful packages here as well but I'm much less familiar with them:
dat <- read.table(text="Year type value1 value2 value3 value4 value5
1 a 1 1 2 3 4
1 b 10 12 9 8 10
2 a 1 2 2 2 1
2 b 11 10 13 9 14", header=TRUE)
dat_split <- split(dat, dat$Year) # split our data into a list by year
output <- sapply(dat_split, function(x) {
    y <- x[x$type == "a", -c(1:2)] == 1  # which a in that year = 1
    z <- x[x$type == "b", -c(1:2)][y]    # grab the b values that a = 1
    if (sum(y) == 0) {                   # eliminate if no a = 1
        return(NA)
    }
    mean(z)
})
data.frame(Year = names(output), type_b_values = output)
## > data.frame(Year = names(output), type_b_values = output)
## Year type_b_values
## 1 1 11.0
## 2 2 12.5
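The same idea can also be sketched without splitting by year, by masking the b values wherever a is not 1 and then taking row means. This is only an illustration: the names value_cols, b_vals and result are made up here, and it assumes each Year has exactly one a row and one b row.

```r
dat <- read.table(text = "Year type value1 value2 value3 value4 value5
1 a 1 1 2 3 4
1 b 10 12 9 8 10
2 a 1 2 2 2 1
2 b 11 10 13 9 14", header = TRUE)

value_cols <- grep("^value", names(dat), value = TRUE)

# Assumption: exactly one "a" row and one "b" row per Year
a <- dat[dat$type == "a", ]
b <- dat[dat$type == "b", ]
b <- b[match(a$Year, b$Year), ]  # align the b rows with the a rows by Year

b_vals <- as.matrix(b[value_cols])
b_vals[as.matrix(a[value_cols]) != 1] <- NA  # mask b values where a is not 1

# Row means over the unmasked b values, one row per Year
result <- data.frame(Year = a$Year,
                     type_b_values = unname(rowMeans(b_vals, na.rm = TRUE)))
result
```

For the sample data this gives 11 for Year 1 and 12.5 for Year 2. Note that a year with no a value equal to 1 comes out as NaN here rather than NA.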
Upvotes: 3