Get mean of a single variable with dplyr

I'm using the diamonds dataset and I'm trying to find the mean price for each of the cuts. I thought this would work:

diamonds_data %>%
  filter(Cut == 'Ideal') %>%
  mean(Price)

But I get the following warning message:

[1] NA
Warning message:
In mean.default(., diamonds_data, Price) :
  argument is not numeric or logical: returning NA

Upvotes: 1

Answers (2)

Darren Tsai

Reputation: 35604

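To simply make your code work, wrap the call in braces and use mean(.$price). Without the braces, %>% would still pass the data frame as the first argument to mean(). (The example below uses ggplot2's diamonds, whose column names are lowercase.)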

diamonds %>%
  filter(cut == 'Ideal') %>%
  {mean(.$price)}

# [1] 3457.542
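
To see why the braces matter, here is a minimal sketch on a made-up three-row tibble (the `toy` data and the variable names are hypothetical stand-ins, not the real diamonds dataset). It compares the piped call with and without braces.

```r
library(dplyr)

# Hypothetical toy data standing in for the diamonds dataset
toy <- tibble(
  cut   = c("Ideal", "Ideal", "Fair"),
  price = c(100, 300, 500)
)

# Without braces, %>% still inserts the data frame as the first argument,
# so mean() receives a data frame as x and returns NA with a warning
no_braces <- suppressWarnings(
  toy %>% filter(cut == "Ideal") %>% mean(.$price)
)

# With braces, the left-hand side is used only where the dot appears
with_braces <- toy %>% filter(cut == "Ideal") %>% {mean(.$price)}

print(no_braces)
print(with_braces)
```

The first result is NA, the same symptom as in the question, while the second is the mean of the two Ideal prices.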

A better option is to compute the mean price for every cut at once and assign the summary table to an object.

price <- diamonds %>%
  group_by(cut) %>%
  summarise(mean_price = mean(price))

# # A tibble: 5 x 2
#   cut       mean_price
#   <ord>          <dbl>
# 1 Fair           4359.
# 2 Good           3929.
# 3 Very Good      3982.
# 4 Premium        4584.
# 5 Ideal          3458.

When you need a particular value, extract it from the table.

price$mean_price[price$cut == "Ideal"]

# [1] 3457.542
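
If you would rather stay in the pipe, the same lookup can be sketched with filter() and pull(). This is only an illustration on hypothetical toy data; the name `price_by_cut` is my own choice, picked so the summary object does not share a name with the price column.

```r
library(dplyr)

# Hypothetical toy data standing in for the diamonds dataset
toy <- tibble(
  cut   = c("Ideal", "Ideal", "Fair", "Fair"),
  price = c(100, 300, 500, 700)
)

# One row per cut, with the mean price of that cut
price_by_cut <- toy %>%
  group_by(cut) %>%
  summarise(mean_price = mean(price))

# Keep the row for one cut, then extract the value as a plain number
ideal_mean <- price_by_cut %>%
  filter(cut == "Ideal") %>%
  pull(mean_price)

print(ideal_mean)
```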

Upvotes: 1

Ric S

Reputation: 9277

You cannot call mean() directly on a data frame: it expects a numeric or logical vector. To get a single number from a column, use pull() to extract that column as a vector first.

diamonds_data %>% 
  filter(Cut == "Ideal") %>% 
  pull(Price) %>% 
  mean()
# [1] 3457.542
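
As a rough illustration of the difference, the sketch below (hypothetical toy data, using the capitalised column names from the question) shows that select() still returns a data frame, whereas pull() returns the plain vector that mean() needs.

```r
library(dplyr)

# Hypothetical toy data using the question's column names
toy <- tibble(
  Cut   = c("Ideal", "Ideal", "Fair"),
  Price = c(100, 300, 500)
)

ideal <- toy %>% filter(Cut == "Ideal")

# select() keeps the data frame wrapper, so the result is not numeric
selected_is_numeric <- is.numeric(select(ideal, Price))

# pull() drops the wrapper and returns the column as a numeric vector
pulled_is_numeric <- is.numeric(pull(ideal, Price))

mean_ideal <- ideal %>% pull(Price) %>% mean()

print(selected_is_numeric)
print(pulled_is_numeric)
print(mean_ideal)
```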

Upvotes: 5
