Reputation: 29
I have a list of samples and a second list of the same samples, plus their values. In the second one, some rows are missing; thus, I want to compare the two tables to find which one is missing and fill it with NA. Each sample is in triplicates. The tables look like the following ones:
Table A Table B
------------------------ ----------------------------------
Name treat. n Name treat. n Value
AA H 1 AA H 1 x1
AA H 1 AA H 1 x2
AA H 1 AA H 1 x3
BB C 2 BB C 2 y1
BB C 2 BB C 2 y2
BB C 2 CC H 3 z1
CC H 3
CC H 3
CC H 3
What I would like to have is something like this. Can you help me to write the code? Thank you very much for your help.
Name treat. n Value
----------------------------------
AA H 1 x1
AA H 1 x2
AA H 1 x3
BB C 2 y1
BB C 2 y2
BB C 2 NA
CC H 3 z1
CC H 3 NA
CC H 3 NA
Upvotes: 0
Views: 453
Reputation: 17069
You can add a unique “observation ID” by grouping by name
and using dplyr::row_number()
. Then merge your tables, with observation id and other shared columns as keys.
library(dplyr)
table1 <- table1 %>%
group_by(name) %>%
mutate(obs = row_number()) %>%
ungroup()
table2 <- table2 %>%
group_by(name) %>%
mutate(obs = row_number()) %>%
ungroup()
left_join(table1, table2)
Output:
# A tibble: 9 × 5
name treat n obs value
1 AA H 1 1 x1
2 AA H 1 2 x2
3 AA H 1 3 x3
4 BB C 2 1 y1
5 BB C 2 2 y2
6 BB C 2 3 NA
7 CC H 3 1 z1
8 CC H 3 2 NA
9 CC H 3 3 NA
Upvotes: 1