Marcos Augusto
Marcos Augusto

Reputation: 51

How to join tables, keep columns and display NA values?

I am trying to do a full join with tables, keep the two colums and insert NA values when there is a unmatch:

d1 = tibble(
  Period1 = c('jan/2020', 'fev/2020', 'mar/2020', 'mar/2020', 'nov/2020')
)

d2 = tibble(
  Period2 = c('jan/2020', 'abr/2020', 'mar/2020', 'fev/2020', 'dez/2020')
)

full_join(d1,d2, by = c("Period1" = 'Period2'))

I get this ouput:

enter image description here

But I want this output:

# A tibble: 7 x 2
#  Period1  Period2 
#  <chr>    <chr>   
#1 jan/2020 jan/2020
#2 fev/2020 fev/2020
#3 mar/2020 mar/2020
#4 mar/2020 mar/2020
#5 nov/2020 <NA>    
#6 <NA>     abr/2020
#7 <NA>     dez/2020

Upvotes: 1

Views: 235

Answers (2)

linog
linog

Reputation: 6226

With data.table:

library(data.table)
dt <- merge(setDT(d1)[,Period := Period1],
            setDT(d2)[,Period := Period2],
            by.x = "Period1",
            by.y = "Period2", all = TRUE,
            suffixes = c("1",'2')
      )[,-1]

dt
    Period1  Period2
1:     <NA> abr/2020
2:     <NA> dez/2020
3: fev/2020 fev/2020
4: jan/2020 jan/2020
5: mar/2020 mar/2020
6: mar/2020 mar/2020
7: nov/2020     <NA>

You will get a warning since you create a column Period1 within the merge that is a key but you can ignore it ([,-1] is used to drop this temporary join key)

Upvotes: 2

akrun
akrun

Reputation: 887158

We could create a new column and then join

library(dplyr)
d1 %>%
   mutate(Period2 = Period1) %>% 
  full_join(d2 %>%
         mutate(Period1 = Period2), by = c("Period1" = "Period2")) %>%
  select(Period1 = Period2, Period2 = Period1.y)
# A tibble: 7 x 2
#  Period1  Period2 
#  <chr>    <chr>   
#1 jan/2020 jan/2020
#2 fev/2020 fev/2020
#3 mar/2020 mar/2020
#4 mar/2020 mar/2020
#5 nov/2020 <NA>    
#6 <NA>     abr/2020
#7 <NA>     dez/2020

Upvotes: 2

Related Questions