user849541
Reputation: 176

Create data frame containing the row position of missing values

I would like to create a table that contains the row positions of the missing values in an original data frame. In other words, it would take the first table below as input and produce the second one.

[image: the input data frame and the desired table of row positions]

I know that I can use apply to create a list of these row positions, but I am struggling to turn that list into a data frame.

# Minimum working example

# Create dataset
data0 <- data.frame("A" = c(NA,NA,1,1), "B"= c(1,NA,1,1),"C"= c("john","john",NA,NA),"D"= c("john","john","john","john"))

# Create list of all rows containing missing values for a particular column then print as dataframe

list1<-apply(is.na(data0), 2, which)

> print(list1)
$A
[1] 1 2

$B
[1] 2

$C
[1] 3 4

$D
integer(0)

# Turn list1 into a data.frame (the desired result)

Upvotes: 2

Views: 108

Answers (2)

Ronak Shah

Reputation: 388817

Using sapply and starting from data0, you can do:

sapply(data0, function(x) which(is.na(x))[seq_along(x)])

#      A  B  C  D
#[1,]  1  2  3 NA
#[2,]  2 NA  4 NA
#[3,] NA NA NA NA
#[4,] NA NA NA NA
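
A small sketch of why this works, in case it helps: indexing a vector past its end returns NA, so `[seq_along(x)]` pads each result to the full column length. The names `idx`, `padded` and `res` are only illustrative, and the `as.data.frame()` wrapper is an optional extra step if a data frame (rather than a matrix) is wanted.

```r
# Same example data as in the question
data0 <- data.frame("A" = c(NA, NA, 1, 1), "B" = c(1, NA, 1, 1),
                    "C" = c("john", "john", NA, NA),
                    "D" = c("john", "john", "john", "john"))

# Row positions of the missing values in column A
idx <- which(is.na(data0$A))

# Asking for positions 1..4 of a length-2 vector pads it with NA
padded <- idx[seq_along(data0$A)]
print(padded)
# [1]  1  2 NA NA

# sapply() returns a matrix here; wrap it to get a data frame
res <- as.data.frame(sapply(data0, function(x) which(is.na(x))[seq_along(x)]))
print(res)
```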

Upvotes: 1

akrun

Reputation: 886948

Loop over the list with sapply and set the length of each element to nrow(data0), which appends NA at the end wherever an element has fewer values, then wrap the result with as.data.frame:

as.data.frame(sapply(list1, `length<-`, nrow(data0)))

-output

   A  B  C  D
1  1  2  3 NA
2  2 NA  4 NA
3 NA NA NA NA
4 NA NA NA NA
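
To illustrate the padding step on its own, here is a minimal sketch of what the `length<-` replacement function does to a single vector. The names `x`, `y`, `z` and `list2` are only for illustration. The last part assumes you may prefer to build the list with lapply, since apply returns a matrix or vector instead of a list when every column happens to have the same number of NAs.

```r
# Extending a vector with length<- fills the new slots with NA
x <- c(3L, 4L)
length(x) <- 4
print(x)
# [1]  3  4 NA NA

# The same thing in functional form, as used inside sapply()
y <- `length<-`(c(3L, 4L), 4)

# An empty vector becomes all NA
z <- `length<-`(integer(0), 4)

# Building the list with lapply() always gives a list, whatever the NA counts
data0 <- data.frame("A" = c(NA, NA, 1, 1), "B" = c(1, NA, 1, 1),
                    "C" = c("john", "john", NA, NA),
                    "D" = c("john", "john", "john", "john"))
list2 <- lapply(data0, function(col) which(is.na(col)))
res <- as.data.frame(sapply(list2, `length<-`, nrow(data0)))
print(res)
```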

We could also do this with dplyr:

library(dplyr)
data0 %>% 
    mutate(across(everything(), ~ replace(rep(NA_integer_, n()), 
         is.na(.), which(is.na(.)))[order(!is.na(.))]))

-output

   A  B  C  D
1  1  2  3 NA
2  2 NA  4 NA
3 NA NA NA NA
4 NA NA NA NA

If we don't need to reorder the values, i.e. each position stays in the row where the missing value occurs:

NA^(!is.na(data0)) * row(data0)

-output

      A  B  C  D
[1,]  1 NA NA NA
[2,]  2  2 NA NA
[3,] NA NA  3 NA
[4,] NA NA  4 NA
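
As a rough breakdown of the one-liner above: `NA^0` is 1 and `NA^1` is NA in R, so raising NA to the logical matrix `!is.na(data0)` gives 1 where the original value is missing and NA everywhere else. Multiplying by `row(data0)` then keeps the row index only at the missing cells. The names `mask` and `res` are just for illustration, and the result is a numeric matrix, so `as.data.frame()` is an optional final step.

```r
data0 <- data.frame("A" = c(NA, NA, 1, 1), "B" = c(1, NA, 1, 1),
                    "C" = c("john", "john", NA, NA),
                    "D" = c("john", "john", "john", "john"))

# TRUE is treated as 1 and FALSE as 0: NA^1 is NA, NA^0 is 1
print(NA^c(TRUE, FALSE))
# [1] NA  1

# 1 where data0 is missing, NA where it is present
mask <- NA^(!is.na(data0))

# row(data0) holds each cell's row number; the NA entries in mask wipe the rest
res <- mask * row(data0)

# Optional: convert the matrix to a data frame
res_df <- as.data.frame(res)
print(res_df)
```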

Upvotes: 1
