user3227641

Reputation: 177

How can I fill missing values based on a condition in R?

I'm trying to fill missing values in one column, for rows that meet certain conditions, using values from another column. My data is given below. For China in the years 2002 and 2003, "manu_GDP" is missing, and I want to copy the values from the column "manu_GDP_old" into "manu_GDP" for those rows.

I would ideally like to do this using the dplyr package.

Thanks in advance.

df <- structure(list(
  country = c("Brazil", "Brazil", "Brazil", "Brazil", "Brazil",
              "China", "China", "China", "China", "China"),
  year = c(2002, 2003, 2004, 2005, 2006, 2002, 2003, 2004, 2005, 2006),
  manu_GDP = c(12.3569626659174, 14.4507645634139, 15.0995301566951,
               14.7382811350657, 14.108945871671, NA, NA,
               31.9750699702633, 32.0939243286777, 32.4523280565943),
  manu_GDP_old = c(NA, NA, NA, NA, NA, "31.1", "32.5", "32.0", "32.1", "32.5")),
  row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))



Upvotes: 1

Views: 1785

Answers (2)

AlexB

Reputation: 3269

One way would be (manu_GDP_old is stored as character, so convert it before filling):

library(dplyr)

df %>%
  mutate(manu_GDP = ifelse(is.na(manu_GDP), as.numeric(manu_GDP_old), manu_GDP))

or, with coalesce(), which takes the first non-missing value from its arguments:

df %>%
  mutate(manu_GDP = coalesce(manu_GDP, as.numeric(manu_GDP_old)))

To restrict the fill to a single country:

df %>%
  mutate(manu_GDP = ifelse(is.na(manu_GDP) & country == 'China',
                           as.numeric(manu_GDP_old),
                           manu_GDP))
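
A note on types: in the question's data, manu_GDP is numeric while manu_GDP_old is character, and ifelse() will coerce the whole result to character if the two are mixed. A minimal base R sketch of that effect, using made-up vectors x and y (not the original data):

```r
x <- c(NA, 2.5)          # numeric, with a gap to fill
y <- c("1.5", "9.9")     # replacement values stored as text

as_text   <- ifelse(is.na(x), y, x)              # result is coerced to character
as_number <- ifelse(is.na(x), as.numeric(y), x)  # result stays numeric

class(as_text)
class(as_number)
```

Converting with as.numeric() before filling keeps the target column usable for arithmetic afterwards.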

Upvotes: 1

Rory S

Reputation: 1298

dplyr Method

library(dplyr)

# manu_GDP_old is character, so convert it to keep manu_GDP numeric
df %>%
    mutate(manu_GDP = case_when(country == "China" & is.na(manu_GDP) ~ as.numeric(manu_GDP_old),
                                TRUE ~ manu_GDP))

Base R Method

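# Logical index of the rows to fill
tf <- df$country == "China" & is.na(df$manu_GDP)
# Convert the character values so manu_GDP stays numeric
df$manu_GDP[tf] <- as.numeric(df$manu_GDP_old[tf])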
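
Restricting by Year as Well

If the fill should also be limited to the years mentioned in the question (2002 and 2003), the condition can be extended. This is only a sketch, assuming dplyr is installed; toy is a cut-down stand-in for df with invented values, not the original data:

```r
library(dplyr)

# Cut-down stand-in for df: manu_GDP is numeric, manu_GDP_old is character
toy <- data.frame(
  country      = c("Brazil", "China", "China", "China"),
  year         = c(2002, 2002, 2003, 2004),
  manu_GDP     = c(12.4, NA, NA, 32.0),
  manu_GDP_old = c(NA, "31.1", "32.5", "32.0"),
  stringsAsFactors = FALSE
)

filled <- toy %>%
  mutate(manu_GDP = if_else(
    country == "China" & year %in% c(2002, 2003) & is.na(manu_GDP),
    as.numeric(manu_GDP_old),  # convert text to numeric before filling
    manu_GDP                   # otherwise keep the existing value
  ))

filled
```

if_else() is stricter than ifelse() about both branches having the same type, which is why the conversion happens inside the call.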

Upvotes: 2
