Reputation: 23
I have two data frames, and I want to transfer the latitude and longitude from one df to another. DF1 is missing most of the lat and long, but DF2 also has a few missing values in the lat and long tool. I want to match on their location id (loc_id), but not all rows in DF1 have a loc_id.
reg_id <- c(1,2,3,4,5,6,7)
loc_id <- c(NA,1,1,NA,2,3,4)
lat <- c(NA,12.56,12.56,NA,NA,13.757,NA)
long <- c(NA,-31.2,-31.2,NA,NA,-56.43,NA)
DF1 <- data.frame(reg_id,loc_id,lat,long)
loc_id <- c(1,2,3,4,5,60)
loc_lat <- c(12.56,20.7,13.757,NA,20.3,50.7)
loc_long <- c(-31.2,-47.3,-56.43,NA,-71.123,-27.1)
DF2 <- data.frame(loc_id,loc_lat,loc_long)
Goal is to end up with a df like this:
reg_id loc_id lat long
1 NA NA NA
2 1 12.56 -31.2
3 1 12.56 -31
4 NA NA NA
5 2 20.7 -47.3
6 3 13.757 -56.43
7 4 NA NA
Note: The lat and long are not the same lengths for each row.
Thank you for your help!
Just some background, I've tried match, setDT
, merge
, to no avail.
no_ll_DF1 <- which(is.na(DF1$lat) & !is.na(DF1$loc_id))
yes_ll_DF2 <- which(!is.na(DF2$loc_lat))
DF1$lat[match(DF2$loc_id[yes_ll_DF2], DF1$loc_id[no_ll_DF1])] <- DF2$loc_lat[yes_ll_DF2]
When using match, I would get this error:
Error in DF1$lat[match(DF2$loc_id[yes_ll_DF2], DF1$loc_id[no_ll_DF1])] <- DF2$loc_lat[yes_ll_DF2] : NAs are not allowed in subscripted assignments
Upvotes: 2
Views: 825
Reputation: 4284
One way to do this using the tidyverse
package
library(tidyverse)
# left join to keep all DF1 and add only matching loc_id in DF2
DF1 %>% left_join(DF2,by="loc_id") %>%
# replace missing lat and long if we have a match in DF2
mutate(lat=ifelse(is.na(lat),loc_lat,lat),
long=ifelse(is.na(long),loc_long,long)) %>%
# remove loc_lat and loc_long columns from dataframe
select(-loc_lat,-loc_long)
Upvotes: 1