States

Reputation: 163

Error when trying to calculate mean value, saying object doesn't exist

I want to find the mean air_time for each of the origins (EWR, JFK and LGA) in nycflights13, but I'm getting an error saying that the object doesn't exist.

library(tidyverse)
library(nycflights13)

flights %>% select(air_time)     # this shows that the column exists and has values

Trying to calculate the mean as below gives me an error:

flights %>% select(mean(air_time))
Error: object 'air_time' not found
Run `rlang::last_error()` to see where the error occurred.

Running rlang::last_error() just prints a confusing trace and says that air_time does not exist, even though it does.

At first I thought it might be because air_time has type dbl (double) and mean(..) could not be run on a double, but mean(1:10.4) yields 5.5, so that's not the case. Any help is much appreciated.

Upvotes: 0

Views: 120

Answers (3)

Ben

Reputation: 30474

The function select chooses variables (columns) from your data frame. You can select air_time (a column name in your data frame) but not mean(air_time), which is not a column.

Instead, if you want the mean times for each of the origins, you can group_by origin first, and then summarise to get the mean for each. Note that since some flights have missing data (NA), you need to remove those to get a numeric mean instead of NA.

flights %>%
  group_by(origin) %>%
  summarise(mean_time = mean(air_time, na.rm = TRUE))

Output

# A tibble: 3 x 2
  origin mean_time
  <chr>      <dbl>
1 EWR         153.
2 JFK         178.
3 LGA         118.
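
The remark about missing values can be checked on a tiny vector. This is only a sketch with made-up numbers (x is a hypothetical stand-in for air_time), showing why na.rm = TRUE is needed:

```r
# Made-up values standing in for air_time; one entry is missing
x <- c(100, NA, 200)

with_na    <- mean(x)                # a single NA makes the whole mean NA
without_na <- mean(x, na.rm = TRUE)  # NA dropped first: (100 + 200) / 2 = 150
n_missing  <- sum(is.na(x))          # number of missing entries: 1

with_na
without_na
n_missing
```

On the real data, sum(is.na(flights$air_time)) should report how many flights have no recorded air time.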

Upvotes: 1

user1357015

Reputation: 11686

I think you're not using dplyr correctly. You can't use select(mean(air_time)) because select expects column names, and you're literally asking it to select a column called mean(air_time). To compute a single overall mean, use summarise instead:

flights %>% summarise(mean_air_time = mean(air_time, na.rm=TRUE))
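
If a plain number is wanted rather than a one-row tibble, one possible variant is to extract the column with pull() and pass it to mean(). The sketch below assumes dplyr is installed and uses a small made-up tibble (toy_flights, hypothetical values) in place of flights so that it runs on its own:

```r
library(dplyr)

# Made-up stand-in for nycflights13::flights; values are illustrative only
toy_flights <- tibble(
  origin   = c("EWR", "EWR", "JFK", "JFK", "LGA"),
  air_time = c(100, NA, 200, 300, 50)
)

# select() keeps columns and returns a one-column tibble, not a number
toy_flights %>% select(air_time)

# pull() extracts the column as a plain numeric vector, which mean() accepts
overall_mean <- toy_flights %>% pull(air_time) %>% mean(na.rm = TRUE)
overall_mean
```

With the real data the same pattern would read flights %>% pull(air_time) %>% mean(na.rm = TRUE).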

Upvotes: 1

Erin Sprünken

Reputation: 400

I am not a tidyverse expert, but as far as I can see the problem occurs in the select statement: the code tries to select the mean of something. Try selecting the column first and computing the mean after the selection. If I use base R, such as

A <- flights$air_time
mean(A, na.rm = TRUE)

I get a result.
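
The base R approach above gives one overall mean. A per-origin version could be sketched with tapply(), which applies a function to a vector split by a grouping variable. The data frame below is made up (toy_flights, hypothetical values) so the example is self-contained:

```r
# Made-up stand-in for flights; values are illustrative only
toy_flights <- data.frame(
  origin   = c("EWR", "EWR", "JFK", "JFK", "LGA"),
  air_time = c(100, NA, 200, 300, 50)
)

# Split air_time by origin and take the mean of each group;
# na.rm = TRUE is passed through to mean()
per_origin <- tapply(toy_flights$air_time, toy_flights$origin, mean, na.rm = TRUE)
per_origin
```

On the real data this would be tapply(flights$air_time, flights$origin, mean, na.rm = TRUE).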

Upvotes: 1
