wickedpanda

Reputation: 17

R: Filling in missing values in a column based on other columns

I have a large data set in which each zipcode has a corresponding latitude and longitude. Some zipcodes are missing. I need to fill in each missing zipcode from another row that has the same lat and lon and a non-missing zipcode. In this example I want rows 4 and 5 to get a and b as their zipcode, because they share lat and lon with rows 1 and 2:

zipcode <- c("a","b","c","","")
lat <- c("1","2","3","1","2")
lon <- c("6","7","8","6","7")
data.frame(zipcode,lat,lon)
  zipcode lat lon
1       a   1   6
2       b   2   7
3       c   3   8
4           1   6
5           2   7

I'd prefer to not install another package unless really necessary.

Thank you

Upvotes: 0

Views: 2312

Answers (1)

sm925

Reputation: 2678

Use na_if from dplyr to replace blank values in the zipcode column with NA, then use fill from tidyr to fill those NAs within each lat/lon group:

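library(dplyr)
library(tidyr)

# df is the data frame from the question
df %>%
    group_by(lat, lon) %>%
    mutate(zipcode = na_if(zipcode, "")) %>%  # treat "" as missing
    # "downup" also fills rows that come before the known zipcode in their group
    fill(zipcode, .direction = "downup") %>%
    ungroup()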

#   zipcode lat   lon
#   <fct>   <fct> <fct>
# 1 a       1     6
# 2 b       2     7
# 3 c       3     8
# 4 a       1     6
# 5 b       2     7
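
Since the question asks to avoid extra packages, here is a minimal base R sketch of the same idea. It rebuilds the example data under the name df, and it assumes missing zipcodes are stored as "" and that each lat/lon pair maps to a single zipcode. The helper names (known, key_all, key_known, idx, missing) are only for illustration.

```r
# Rebuild the example data from the question; "" marks a missing zipcode.
df <- data.frame(
  zipcode = c("a", "b", "c", "", ""),
  lat = c("1", "2", "3", "1", "2"),
  lon = c("6", "7", "8", "6", "7"),
  stringsAsFactors = FALSE
)

# Lookup table: the first row with a known zipcode for each (lat, lon) pair.
known <- df[df$zipcode != "", ]
known <- known[!duplicated(known[c("lat", "lon")]), ]

# Build one key per row and match every row against the lookup table.
# Assumption: "_" never appears inside lat or lon values.
key_all <- paste(df$lat, df$lon, sep = "_")
key_known <- paste(known$lat, known$lon, sep = "_")
idx <- match(key_all, key_known)

# Fill only the blank zipcodes for which a matching pair exists.
missing <- df$zipcode == "" & !is.na(idx)
df$zipcode[missing] <- known$zipcode[idx[missing]]

df
```

Rows whose lat/lon pair never appears with a zipcode stay blank. Unlike fill, this does not depend on row order.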

Upvotes: 1
