Reputation: 157
I ran into an interesting problem when computing the mean/median of a subset of a data frame column that satisfies a condition.
At this point I don't need advice, but I thought the problem would be interesting to share:
library(dplyr)
testFile <- data.frame(
  ID = 1:10, Value = rnorm(10),
  Condition = sample(c("Yes", "No"), 10, replace = TRUE)
)
median1 <- median(testFile[testFile$Condition == "Yes", ]$Value)  # works
median2 <- testFile %>%
  filter(Condition == "Yes") %>%
  select(Value) %>%
  median
Error in median.default(.) : need numeric data
However, the following code works, and the selected column is numeric:
max2 <- testFile %>%
  filter(Condition == "Yes") %>%
  select(Value) %>%
  max
values <- testFile %>%
  filter(Condition == "Yes") %>%
  select(Value)
class(values[2,1])
[1] "numeric"
Upvotes: 2
Views: 1175
Reputation: 157
A dplyr solution: select() returns a one-column data frame, not a numeric vector, so flatten it with unlist before calling median:
median3 <- testFile %>%
  filter(Condition == "Yes") %>%
  select(Value) %>%
  unlist %>%
  median
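The reason behind the error can be checked directly: the pipeline hands median() a data frame, which it rejects, whereas max() accepts data frames. The sketch below is only an illustration under a few assumptions: the names yes_values, median4 and median5 are made up here, set.seed() is added purely for reproducibility, and pull() requires a reasonably recent dplyr. It shows two alternatives to unlist, namely pull(), which extracts the column as a vector, and summarise(), which computes the median inside the data frame.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
testFile <- data.frame(
  ID = 1:10, Value = rnorm(10),
  Condition = sample(c("Yes", "No"), 10, replace = TRUE)
)

# select() keeps the data frame structure, even for a single column
yes_values <- testFile %>%
  filter(Condition == "Yes") %>%
  select(Value)
is.data.frame(yes_values)  # the object passed to median() is a data frame
is.numeric(yes_values)     # ... and not a numeric vector

# Alternative 1 (hypothetical name median4): pull() returns the column as a vector
median4 <- testFile %>%
  filter(Condition == "Yes") %>%
  pull(Value) %>%
  median()

# Alternative 2 (hypothetical name median5): compute the median inside summarise();
# the result is a one-row data frame with a column named med
median5 <- testFile %>%
  filter(Condition == "Yes") %>%
  summarise(med = median(Value))
```

pull() is probably the closest drop-in replacement for select() here, since the rest of the pipeline stays unchanged; summarise() is more convenient when the result should remain a data frame.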
Non-dplyr solution, as already mentioned in the question:
median1 <- median(testFile[testFile$Condition == "Yes", ]$Value)  # works
Upvotes: 3