Giora Simchoni

Reputation: 3689

Augment sublists in a list of sublists with missing elements as NA

I have a list of lists (I will call them "sublists" to avoid confusion) containing named elements. Not all sublists contain all named elements. I want to add every missing element to its sublist with the value NA.

Example:

l <- list(list(a = 1, b = 2, c = 3),
  list(a = 4, b = 5, c = 6),
  list(a = 7, b = 8),
  list(a = 9, c = 10))

As can be seen, the 3rd and 4th sublists are missing the c and b elements respectively. I would like these elements to be augmented as NA to these sublists, i.e.:

res <- list(list(a = 1, b = 2, c = 3),
  list(a = 4, b = 5, c = 6),
  list(a = 7, b = 8, c = NA),
  list(a = 9, b = NA, c = 10))

In reality, if this makes it any easier, each sublist is missing only the last k elements (i.e. I do not have a situation as in the 4th sublist missing a middle element b), but I feel like while we're at it, let's find a generic solution.

UPDATE: I got 3 great solutions for this specific scenario, where the sublists' elements are numbers. But the elements can be character strings, or even lists! E.g.:

l <- list(list(a = list(1,2), b = 2, c = 3),
      list(b = 5, c = 6),
      list(a = list(5,6), b = 8),
      list(a = list(7,8), c = 10))

The a element is a list and should stay that way in the res list. If it is missing, I would like an NA, as usual:

res <- list(list(a = list(1,2), b = 2, c = 3),
  list(a = NA, b = 5, c = 6),
  list(a = list(5,6), b = 8, c = NA),
  list(a = list(7,8), b = NA, c = 10))

Upvotes: 3

Views: 351

Answers (2)

zx8754

Reputation: 56219

Update: We can collect the unique names across all sublists, then loop through the sublists and extract each of those names. Extracting a name that is not in a sublist returns NULL, and we replace those NULLs with NA. This should work for all inputs.

# data
l <- list(list(a = list(1,2), b = 2, c = 3),
      list(b = 5, c = 6),
      list(a = list(5,6), b = 8),
      list(a = list(7,8), c = 10))

myNames <- unique(unlist(lapply(l, names)))

res <- lapply(l, function(i){
  x2 <- lapply(myNames, function(j){
    x1 <- i[[ j ]]
    if(is.null(x1)){ x1 <- NA}
    x1
    })
  names(x2) <- myNames
  x2
})

# check results
identical(res,
          #expected output
          list(list(a = list(1,2), b = 2, c = 3),
               list(a = NA, b = 5, c = 6),
               list(a = list(5,6), b = 8, c = NA),
               list(a = list(7,8), b = NA, c = 10)))
# [1] TRUE
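
The same lookup can also be wrapped in a reusable function. The sketch below is only an illustration and not part of the code above: the name fill_missing and its fill argument are made up here. It tests nm %in% names(sub) instead of is.null(), on the assumption that an element explicitly set to NULL in a sublist should be kept as is rather than replaced.

```r
# Hypothetical helper (the name and the 'fill' argument are illustrative)
fill_missing <- function(lst, fill = NA) {
  # every name that appears in at least one sublist, in order of first appearance
  all_names <- unique(unlist(lapply(lst, names)))
  lapply(lst, function(sub) {
    out <- lapply(all_names, function(nm) {
      # keep the existing element (even an explicit NULL); otherwise use 'fill'
      if (nm %in% names(sub)) sub[[nm]] else fill
    })
    names(out) <- all_names
    out
  })
}

l <- list(list(a = list(1, 2), b = 2, c = 3),
          list(b = 5, c = 6),
          list(a = list(5, 6), b = 8),
          list(a = list(7, 8), c = 10))

res <- fill_missing(l)
str(res[[2]])
```

Passing a different fill value, for example fill_missing(l, fill = NA_character_), would be one way to control the type of the placeholder.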

Original: When every element is a single atomic value, we can treat each sublist as a one-row data frame, row-bind them with missing columns filled with NA, then split the result back into lists:

# data:
l <- list(list(a = 1, b = 2, c = 3),
          list(a = 4, b = 5, c = 6),
          list(a = 7, b = 8),
          list(a = 9, c = 10))

library(dplyr)

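# convert each sublist to a one-row tibble and row-bind them;
# columns missing from a sublist are filled with NA
# (as_tibble replaces the deprecated as_data_frame)
x <- bind_rows(lapply(l, as_tibble))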

# then convert it back to list
res <- lapply(split(x, seq(nrow(x))), as.list)

# drop names, we can skip this step if we want to keep names as 1,2,3,4...
names(res) <- NULL

# result
res

# [[1]]
# [[1]]$a
# [1] 1
# 
# [[1]]$b
# [1] 2
# 
# [[1]]$c
# [1] 3
# 
# 
# [[2]]
# [[2]]$a
# [1] 4
# 
# [[2]]$b
# [1] 5
# 
# [[2]]$c
# [1] 6
# 
# 
# [[3]]
# [[3]]$a
# [1] 7
# 
# [[3]]$b
# [1] 8
# 
# [[3]]$c
# [1] NA
# 
# 
# [[4]]
# [[4]]$a
# [1] 9
# 
# [[4]]$b
# [1] NA
# 
# [[4]]$c
# [1] 10

Upvotes: 2

storm surge

Reputation: 841

Surely there is a better way to do it, but this works with both examples.

res and res2 are the example results you have provided.

l.res and l2.res are the results from the code.

l <- list(list(a = 1, b = 2, c = 3),
          list(a = 4, b = 5, c = 6),
          list(a = 7, b = 8),
          list(a = 9, c = 10))

res <- list(list(a = 1, b = 2, c = 3),
             list(a = 4, b = 5, c = 6),
             list(a = 7, b = 8, c = NA),
             list(a = 9, b = NA, c = 10))

l2 <- list(list(a = list(1,2), b = 2, c = 3),
          list(b = 5, c = 6),
          list(a = list(5,6), b = 8),
          list(a = list(7,8), c = 10))
res2 <- list(list(a = list(1,2), b = 2, c = 3),
            list(a = NA, b = 5, c = 6),
            list(a = list(5,6), b = 8, c = NA),
            list(a = list(7,8), b = NA, c = 10))


# vector with the 'column names' to be checked
aux <- c("a", "b", "c")

# Function that checks whether every sublist has all the named elements;
# if one is missing, create it and assign it the value NA
myfunction <- function(l.list, n.names) {
  for (i in seq_along(l.list)) {
    for (j in seq_along(n.names)) {
      if (!(n.names[j] %in% names(l.list[[i]]))) {
        l.list[[i]][n.names[j]] <- NA
        # re-sort the sublist alphabetically by element name
        l.list[[i]] <- l.list[[i]][order(names(l.list[[i]]))]
      }
    }
  }
  return(l.list)
}

#Applying to example 1
l.res<-myfunction(l,aux)

data.frame(l.res) #as a data frame just for comparison purpose
##   a b c a.1 b.1 c.1 a.2 b.2 c.2 a.3 b.3 c.3
## 1 1 2 3   4   5   6   7   8  NA   9  NA  10
data.frame(res)
##   a b c a.1 b.1 c.1 a.2 b.2 c.2 a.3 b.3 c.3
## 1 1 2 3   4   5   6   7   8  NA   9  NA  10


#Applying to example 2
l2.res<-myfunction(l2,aux)

data.frame(l2.res) #as a data frame just for comparison purpose
##   a.1 a.2 b c  a b.1 c.1 a.5 a.6 b.2 c.2 a.7 a.8 b.3 c.3
## 1   1   2 2 3 NA   5   6   5   6   8  NA   7   8  NA  10
data.frame(res2)
##   a.1 a.2 b c  a b.1 c.1 a.5 a.6 b.2 c.2 a.7 a.8 b.3 c.3
## 1   1   2 2 3 NA   5   6   5   6   8  NA   7   8  NA  10
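
As a possible variation on the loop above, here is a minimal sketch that avoids the nested for loops and compares the result with identical() rather than through data.frame(). The name add_missing is hypothetical, and the sketch assumes the sublists should come out in the order given by n.names instead of alphabetical order.

```r
# Hypothetical helper: add NA for every name in n.names that a sublist lacks,
# then reorder each sublist to follow n.names
add_missing <- function(l.list, n.names) {
  lapply(l.list, function(sub) {
    missing <- setdiff(n.names, names(sub))
    sub[missing] <- NA   # no-op when nothing is missing
    # names from n.names first, then any names not listed in n.names
    sub[c(n.names, setdiff(names(sub), n.names))]
  })
}

aux <- c("a", "b", "c")

l2 <- list(list(a = list(1, 2), b = 2, c = 3),
           list(b = 5, c = 6),
           list(a = list(5, 6), b = 8),
           list(a = list(7, 8), c = 10))
res2 <- list(list(a = list(1, 2), b = 2, c = 3),
             list(a = NA, b = 5, c = 6),
             list(a = list(5, 6), b = 8, c = NA),
             list(a = list(7, 8), b = NA, c = 10))

l2.res <- add_missing(l2, aux)
identical(l2.res, res2)
```

identical() compares names, types and nesting, so it is a stricter check than looking at the flattened data frames.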

Hope it helps.

Upvotes: 0
