pun11

Reputation: 157

Dplyr select() - need numeric data issue

I ran into an interesting problem when computing mean/median of a subset of a column in a data frame given a condition.

At this point, I don't need advice, but thought this problem would be interesting to share:

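library(dplyr)

testFile <- data.frame(
    ID = 1:10, Value = rnorm(10),
    Condition = sample(c("Yes", "No"), 10, replace = TRUE)
)
# Base R: subsetting with $ yields a numeric vector, so median() works
median1 <- median(testFile[testFile$Condition == "Yes", ]$Value)
# dplyr: select() yields a one-column data frame, so median() fails
median2 <- testFile %>%
    filter(Condition == "Yes") %>%
    select(Value) %>%
    median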

Error in median.default(.) : need numeric data

However, the following code works:

max2 <- testFile %>% 
    filter(Condition == "Yes") %>% 
    select(Value) %>% 
    max  

values <- testFile %>% 
    filter(Condition == "Yes") %>% 
    select(Value)

class(values[2,1])

[1] "numeric"
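The check above inspects a single cell, not the object that is actually piped into median. A minimal sketch of the likely cause, assuming dplyr is loaded and using a fixed Condition column (my own substitution for sample(), so that the "Yes" group is never empty): select() returns a data frame, max() has a data frame method through the Summary group generic, and median.default() refuses data frames.

```r
library(dplyr)

# Assumption: deterministic Condition column instead of sample(),
# so the "Yes" subset always has five rows.
testFile <- data.frame(
    ID = 1:10, Value = rnorm(10),
    Condition = rep(c("Yes", "No"), 5)
)

values <- testFile %>%
    filter(Condition == "Yes") %>%
    select(Value)

class(values)          # class of the whole object passed down the pipe
class(values[2, 1])    # class of one cell, as checked in the question

# max() is handled by the Summary group method for data frames
max_df <- max(values)

# median() falls through to median.default(), which rejects data frames;
# capture the message instead of stopping the script
median_err <- tryCatch(median(values), error = function(e) conditionMessage(e))
```

Comparing class(values) with class(values[2, 1]) is what separates the two cases: the cell is numeric, but the object handed to median is not a vector.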

Upvotes: 2

Views: 1175

Answers (1)

pun11

Reputation: 157

dplyr solution: select() returns a one-column data frame, so convert it to a vector with unlist before calling median:

median3 <- testFile %>% filter(Condition == "Yes") %>% select(Value) %>% unlist %>% median

Non-dplyr solution, as already mentioned in the question:

median1 <- median(testFile[testFile$Condition == "Yes", ]$Value) # works
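Two other ways to stay inside the pipe, sketched here as alternatives rather than as the fix used above: pull() extracts a column as a vector, and summarise() computes the median without leaving the data frame. The deterministic Condition column and the names median_pull, median_summ and median_base are my own, chosen only for illustration.

```r
library(dplyr)

# Assumption: deterministic Condition column so the "Yes" group is never empty
testFile <- data.frame(
    ID = 1:10, Value = rnorm(10),
    Condition = rep(c("Yes", "No"), 5)
)

# pull() returns the column as a plain numeric vector
median_pull <- testFile %>%
    filter(Condition == "Yes") %>%
    pull(Value) %>%
    median()

# summarise() applies median() to the column inside the data frame;
# pull() then extracts the single resulting value
median_summ <- testFile %>%
    filter(Condition == "Yes") %>%
    summarise(med = median(Value)) %>%
    pull(med)

# Base R reference value for comparison
median_base <- median(testFile[testFile$Condition == "Yes", ]$Value)
```

All three should agree. pull() is probably the closest drop-in replacement for select() followed by unlist when only one column is needed.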

Upvotes: 3
