Allycorn

Reputation: 3

How to subset a dataframe in R with multiple conditions?

I need to find the average fertility in Russia from 2004 onward in the gapminder dataset that ships with the dslabs package.

library(dplyr)
library(dslabs)

df1 <- data.frame(gapminder)


a <- df1@year >= 2004
df1[df1$fertility %in% c("Russia", a), ]

This code returns only NAs. I tried different variations and watched a few lectures, but could not figure it out. I would appreciate your help.

Upvotes: 0

Views: 46

Answers (1)

Ronak Shah

Reputation: 388797

Reference a data frame column with $, not @ (which accesses S4 slots). Also, 'Russia' is a value in the country column, but your code looks for it in fertility.

Try:

library(dplyr)
df1 %>%
  filter(country == 'Russia', year >= 2004) %>%
  summarise(avg_fertility = mean(fertility, na.rm = TRUE))

#  avg_fertility
#1      1.493333

Without using filter:

df1 %>%
  summarise(avg_fertility = mean(fertility[country == 'Russia' & 
                                           year >= 2004], na.rm = TRUE))

And in base R:

mean(subset(df1, country == 'Russia' & year >= 2004)$fertility, na.rm = TRUE)
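To stay closer to the bracket indexing attempted in the question, here is a minimal sketch of the same idea on a small made-up data frame. The `toy` data and its values are invented for illustration and are not taken from gapminder; the names `keep`, `russia_recent` and `avg_fertility` are likewise just placeholders.

```r
# Toy data standing in for gapminder; these values are invented for illustration.
toy <- data.frame(
  country   = c("Russia", "Russia", "Russia", "France", "France"),
  year      = c(2003, 2004, 2005, 2004, 2005),
  fertility = c(1.3, 1.4, NA, 1.9, 2.0)
)

# Build one logical vector: each condition is tested on its own column,
# and & combines the two tests element-wise.
keep <- toy$country == "Russia" & toy$year >= 2004

# Keep only the rows where both conditions hold.
russia_recent <- toy[keep, ]

# na.rm = TRUE drops the missing fertility value before averaging.
avg_fertility <- mean(russia_recent$fertility, na.rm = TRUE)
avg_fertility
```

One caveat with this style: if `country` or `year` contained NA, `keep` would contain NA too, and bracket indexing would return all-NA rows for those positions. Wrapping the condition in `which()` or using `subset()` as above avoids that.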

Upvotes: 1
