J. Doe.
J. Doe.

Reputation: 1305

R collapse and auto fill blanks in rows

I have the following example dataset:

ID = c(123,123)
NAmer = c("ABC","ABC")
field1 =  c(1,NA)
field2 =  c(NA,2)
field3 = c(NA,NA)
field4 = c(NA,NA)
field5 = c(NA,NA)
IHave <- data.frame(ID,NAmer,field1,field2,field3,field4,field5)
Iwant <- c(123,"ABC",1,2,NA,NA,NA)

How do I go from IHave to Iwant using data.table or tidyverse?

In practice I have some 000 rows.

Upvotes: 2

Views: 283

Answers (3)

IceCreamToucan
IceCreamToucan

Reputation: 28695

You can map over the columns, reduce ing each of them with the coalesce function. The function map_dfc is used instead of map so that it c-binds them together as a data frame (df). So the first column is reduce(IHave[[1]], coalesce), etc.

library(tidyverse)

map_dfc(IHave, reduce, coalesce)
# # A tibble: 1 x 7
#      ID NAmer field1 field2 field3 field4 field5
#   <dbl> <fct>  <dbl>  <dbl> <lgl>  <lgl>  <lgl> 
# 1   123 ABC        1      2 NA     NA     NA  

If you want to do this separately for each ID, NAmer group, you can use the code below instead.

IHave %>% 
  group_by(ID, NAmer) %>% 
  summarise_all(reduce, coalesce)

Upvotes: 3

akrun
akrun

Reputation: 887431

Based on the example, after grouping by 'ID', 'NAmer', keep only the non-NA elements if there is atleast one non-NA element

library(dplyr)
IHave %>% 
   group_by(ID, NAmer) %>%
   summarise_all(list(~ if(all(is.na(.))) NA else .[!is.na(.)]))
# A tibble: 1 x 7
# Groups:   ID [1]
#    ID NAmer field1 field2 field3 field4 field5
#  <dbl> <fct>  <dbl>  <dbl> <lgl>  <lgl>  <lgl> 
#1   123 ABC        1      2 NA     NA     NA   

Or using group by coalesce

IHave %>%
    group_by(ID, NAmer) %>% 
    summarise_all(list(~ coalesce(!!! .))) 
# A tibble: 1 x 7
# Groups:   ID [1]
#    ID NAmer field1 field2 field3 field4 field5
#  <dbl> <fct>  <dbl>  <dbl> <lgl>  <lgl>  <lgl> 
#1   123 ABC        1      2 NA     NA     NA    

Or the same logic in data.table

library(data.table)
setDT(IHave)[, lapply(.SD, function(x) coalesce(!!! x)), .(ID, NAmer)]
#    ID NAmer field1 field2 field3 field4 field5
#1: 123   ABC      1      2     NA     NA     NA

Upvotes: 3

markus
markus

Reputation: 26343

You could transpose your data and use coalesce from dplyr

do.call(dplyr::coalesce, as.data.frame(t(IHave), stringsAsFactors = FALSE))
# [1] "123" "ABC" " 1"  " 2"  NA    NA    NA 

The result is a character vector.

Upvotes: 2

Related Questions