Erik
Erik

Reputation: 29

How to join two tables to find the missing rows?

I have a list of samples and a second list of the same samples, plus their values. In the second one, some rows are missing; thus, I want to compare the two tables to find which one is missing and fill it with NA. Each sample is in triplicates. The tables look like the following ones:

Table A                                 Table B
------------------------                ----------------------------------
Name     treat.      n                  Name     treat.      n       Value
AA         H         1                  AA         H         1         x1
AA         H         1                  AA         H         1         x2
AA         H         1                  AA         H         1         x3
BB         C         2                  BB         C         2         y1
BB         C         2                  BB         C         2         y2
BB         C         2                  CC         H         3         z1
CC         H         3                  
CC         H         3
CC         H         3

What I would like to have is something like this. Can you help me to write the code? Thank you very much for your help.

Name     treat.      n     Value    
----------------------------------              
AA         H         1       x1           
AA         H         1       x2           
AA         H         1       x3           
BB         C         2       y1           
BB         C         2       y2           
BB         C         2       NA           
CC         H         3       z1           
CC         H         3       NA
CC         H         3       NA

Upvotes: 0

Views: 453

Answers (1)

zephryl
zephryl

Reputation: 17069

You can add a unique “observation ID” by grouping by name and using dplyr::row_number(). Then merge your tables, with observation id and other shared columns as keys.

library(dplyr)

table1 <- table1 %>%
  group_by(name) %>%
  mutate(obs = row_number()) %>%
  ungroup()

table2 <- table2 %>%
  group_by(name) %>%
  mutate(obs = row_number()) %>%
  ungroup()
  
left_join(table1, table2)

Output:

# A tibble: 9 × 5
  name  treat     n   obs value
      
1 AA    H         1     1 x1   
2 AA    H         1     2 x2   
3 AA    H         1     3 x3   
4 BB    C         2     1 y1   
5 BB    C         2     2 y2   
6 BB    C         2     3 NA
7 CC    H         3     1 z1   
8 CC    H         3     2 NA
9 CC    H         3     3 NA

Upvotes: 1

Related Questions