Justas Mundeikis

Reputation: 1005

Cannot calculate mean() on a column after a dplyr pipeline

I have the following code, where the last line (the average of the top 5 values in the 17th column) does not work. It returns NA with this warning:

Warning message:
In mean.default(df[1:5, 17], na.rm = TRUE) :
  argument is not numeric or logical: returning NA

Any suggestions what I'm doing wrong? Thanks!

library(eurostat)
library(dplyr)
library(tidyr)


earn_mw_avgr2 <- get_eurostat("earn_mw_avgr2", stringsAsFactors = FALSE)

df <- earn_mw_avgr2 %>% 
    filter(geo %in% c("BE","BG","CZ","DK","DE","EE","IE","EL","ES","FR","HR",
                   "IT","CY","LV","LU","HU","MT","NL","AT","PL","PT","RO",
                   "SI","SK","FI","SE","UK"),
        indic_se=="MW_MEAGE",
        nace_r2=="B-S") %>% 
    spread(time, values)%>%
    select(-c(17,18))%>%
    mutate(avg=rowMeans(.[14:16], na.rm = TRUE))%>%
    arrange(desc(avg))


mean(df[1:5,17], na.rm = TRUE) 

Upvotes: 0

Views: 428

Answers (2)

Noah Olsen

Reputation: 281

Appending $avg to df[1:5,17] fixes it, because $ extracts the column as a numeric vector that mean() can handle. Full code below.

library(eurostat)
library(dplyr)
library(tidyr)


earn_mw_avgr2 <- get_eurostat("earn_mw_avgr2", stringsAsFactors = FALSE)

df <- earn_mw_avgr2 %>% 
  filter(geo %in% c("BE","BG","CZ","DK","DE","EE","IE","EL","ES","FR","HR",
                    "IT","CY","LV","LU","HU","MT","NL","AT","PL","PT","RO",
                    "SI","SK","FI","SE","UK"),
         indic_se=="MW_MEAGE",
         nace_r2=="B-S") %>% 
  spread(time, values)%>%
  select(-c(17,18))%>%
  mutate(avg=rowMeans(.[14:16], na.rm = TRUE))%>%
  arrange(desc(avg))


mean(df[1:5,17]$avg, na.rm = TRUE) 
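The same idea can also be written entirely in dplyr verbs, which avoids relying on column position 17. This is only a minimal sketch on made-up data: the tibble toy and its values are hypothetical stand-ins for the arranged Eurostat table, and only the avg column name is taken from the question.

```r
library(dplyr)

# Hypothetical toy data standing in for the arranged table from the question
toy <- tibble(
  geo = c("A", "B", "C", "D", "E", "F"),
  avg = c(50, 48, 47, 46, 45, 10)
)

top5_mean <- toy %>%
  arrange(desc(avg)) %>%   # largest averages first
  slice(1:5) %>%           # keep the top 5 rows
  pull(avg) %>%            # extract the column as a plain numeric vector
  mean(na.rm = TRUE)

top5_mean
# [1] 47.2
```

pull() plays the same role as $avg here: it hands mean() a vector rather than a one-column tibble.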

Upvotes: 1

akrun

Reputation: 887951

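The issue is that df is a tibble, and [ on a tibble does not drop dimensions the way it does on a data.frame (where, per ?Extract, selecting a single column behaves as drop = TRUE). So df[1:5, 17] is still a one-column tibble rather than a numeric vector. Extract the column with [[ instead: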

mean(df[1:5,][[17]])
#[1] 47.29667

mean() works on vectors, not on data frames or tibbles. From ?mean:

x - An R object. Currently there are methods for numeric/logical vectors and date, date-time and time interval objects. Complex vectors are allowed for trim = 0, only.
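The difference can be seen on a small example. This is a sketch on made-up data, assuming the tibble package is installed; plain and tbl are hypothetical names, not objects from the question.

```r
library(tibble)

# Hypothetical toy data, not the Eurostat table from the question
plain <- data.frame(id = 1:6, avg = c(50, 48, 47, 46, 45, 10))
tbl   <- as_tibble(plain)

# data.frame: selecting a single column drops to a numeric vector
is.numeric(plain[1:5, 2])    # TRUE
mean(plain[1:5, 2])          # 47.2

# tibble: the same subscript keeps a one-column tibble
is.numeric(tbl[1:5, 2])      # FALSE
is.data.frame(tbl[1:5, 2])   # TRUE

# Ways to get a vector out of the tibble so that mean() works
m1 <- mean(tbl[1:5, ][[2]])            # [[ ]] extracts the column
m2 <- mean(tbl$avg[1:5])               # $ extracts the column by name
m3 <- mean(tbl[1:5, 2, drop = TRUE])   # ask [ to drop explicitly
```

All three of m1, m2, and m3 give 47.2 on this toy data.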

Upvotes: 4
