Reputation: 45

How to merge rows with duplicate ID, replacing NAs with data in the other row, and leading with data present in both duplicate rows?

I have a df like this:

data <- tribble(~id, ~othervar, ~it_1, ~it_2, ~it_3, ~it_4, ~it_5, ~it_6,
              "k01", "lum", "a", "b", "c", "a", NA, NA,
              "k01", "lum", NA, NA, NA, NA, "a", "d",
              "k02", "nar", "a", "b", "c", "b", NA, NA,
              "k03", "lum", "a", "b", "a", "c", NA, NA,
              "k03", "lum", "b", "b", "a", NA, "d", "e")

I want to merge rows with duplicated IDs in only one row where NAs are replaced with the information available in the other row. But where there are no-NA in both rows, the problem is to preserve any one. I´ve tried pivoting the table, but have no resources to deal with this.

i expect somthing like this:

id  othervar it_1 it_2 it_3 it_4 it_5 it_6
k01 lum      a    b    c    a    a    d
k02 nar      a    b    c    b    NA   NA
k03 lum      a    b    a    c    d    e

Upvotes: 3

Answers (2)

asd-tm

Reputation: 5263

Without ifelse with dplyr functions only:

data %>%
group_by(id) %>%
summarise(across(everything(), 
~coalesce(.x) %>% 
`[`(!is.na(.)) %>% 
`[`(1) ))

Upvotes: 1

Maël

Reputation: 51994

With ifelse and summarise:

library(dplyr)
data %>% 
  group_by(id) %>% 
  summarise(across(everything(), ~ ifelse(any(complete.cases(.x)),
                                          first(.x[!is.na(.x)]),
                                          NA)))

# A tibble: 3 × 8
  id    othervar it_1  it_2  it_3  it_4  it_5  it_6 
  <chr> <chr>    <chr> <chr> <chr> <chr> <chr> <chr>
1 k01   lum      a     b     c     a     a     d    
2 k02   nar      a     b     c     b     NA    NA   
3 k03   lum      a     b     a     c     d     e

Upvotes: 3

How to merge rows with duplicate ID, replacing NAs with data in the other row, and leading with data present in both duplicate rows?

Answers (2)

Related Questions