Create data frame containing the row position of missing values

Question

I would like to create a table that contains the row position of missing values from an original data frame. This would essentially take the first table as input and create the table below that one.

I know that I can use apply to create a list of these row positions, but I am struggling to turn that list into a data frame.

# Minimum working example

# Create dataset
data0 <- data.frame("A" = c(NA,NA,1,1), "B"= c(1,NA,1,1),"C"= c("john","john",NA,NA),"D"= c("john","john","john","john"))

# Create list of all rows containing missing values for a particular column then print as dataframe

list1<-apply(is.na(data0), 2, which)

> print(list1)
$A
[1] 1 2

$B
[1] 2

$C
[1] 3 4

$D
integer(0)

# Turn list1 to a data.frame leading to answer

akrun · Accepted Answer

Loop over the list with sapply and set the length of each element to nrow of 'data0'. This appends NA at the end wherever an element has fewer values than there are rows. Then wrap the result with as.data.frame

as.data.frame(sapply(list1, `length<-`, nrow(data0)))

-output

  A  B  C  D
1  1  2  3 NA
2  2 NA  4 NA
3 NA NA NA NA
4 NA NA NA NA
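
To see why the length assignment pads with NA, here is a minimal sketch on single vectors. The names `x` and `padded` are only for illustration and are not part of the original answer.

```r
# A vector shorter than the target length
x <- c(1L, 2L)

# Calling `length<-` as a function is equivalent to length(x) <- 4;
# the extra slots are filled with NA
padded <- `length<-`(x, 4)
print(padded)

# An empty integer vector (like list1$D) becomes all NA
empty_padded <- `length<-`(integer(0), 4)
print(empty_padded)
```

Because every element ends up with the same length, sapply can simplify the list to a matrix, which as.data.frame then converts.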

We could also do this as

library(dplyr)
data0 %>% 
    mutate(across(everything(), ~ replace(rep(NA_integer_, n()), 
         is.na(.), which(is.na(.)))[order(!is.na(.))]))

-output

  A  B  C  D
1  1  2  3 NA
2  2 NA  4 NA
3 NA NA NA NA
4 NA NA NA NA
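
The expression inside across() is dense, so here is a sketch of the same steps applied to one column in base R. The variable names (`col_c`, `miss`, `pos`, `result`) are illustrative assumptions, not part of the original answer.

```r
# Column C from data0
col_c <- c("john", "john", NA, NA)

# Logical mask of missing entries
miss <- is.na(col_c)

# Start from an all-NA integer vector and write the row index
# into each slot where the original value is missing
pos <- replace(rep(NA_integer_, length(col_c)), miss, which(miss))

# order(!miss) sorts FALSE before TRUE, so the slots holding
# positions of missing values move to the top
result <- pos[order(!miss)]
print(result)
```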

If we don't need to move the values to the top, i.e. each position stays in the row where it occurs

NA^(!is.na(data0)) * row(data0)
      A  B  C  D
[1,]  1 NA NA NA
[2,]  2  2 NA NA
[3,] NA NA  3 NA
[4,] NA NA  4 NA
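
This relies on how R treats NA in exponentiation: NA^0 is 1, while NA^1 stays NA. A small sketch on a plain vector, assuming a hypothetical logical mask `mask` that is TRUE where a value is missing:

```r
# NA raised to 0 is 1; NA raised to 1 is NA
print(NA^0)
print(NA^1)

# TRUE marks a missing value (illustrative mask)
mask <- c(TRUE, FALSE, FALSE)

# !mask is FALSE (0) where missing, so NA^0 = 1 keeps the index;
# elsewhere NA^1 = NA blanks it out
out <- NA^(!mask) * seq_along(mask)
print(out)
```

Note that the result in the answer above is a matrix rather than a data frame; it can be wrapped in as.data.frame if a data frame is needed.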
