Reputation: 176
I would like to create a table that contains the row positions of the missing values in an original data frame. This would essentially take the first table as input and create the table below it.
I know that I can use apply to create a list of these row positions, but I am struggling to turn that list into a data frame.
# Minimum working example
# Create dataset
data0 <- data.frame("A" = c(NA,NA,1,1), "B"= c(1,NA,1,1),"C"= c("john","john",NA,NA),"D"= c("john","john","john","john"))
# Create a list of the rows containing missing values in each column, then print it
list1<-apply(is.na(data0), 2, which)
> print(list1)
$A
[1] 1 2
$B
[1] 2
$C
[1] 3 4
$D
integer(0)
# Turn list1 to a data.frame leading to answer
Upvotes: 2
Views: 108
Reputation: 388817
Using sapply and starting from data0, you can do:
sapply(data0, function(x) which(is.na(x))[seq_along(x)])
#      A  B  C  D
#[1,]  1  2  3 NA
#[2,]  2 NA  4 NA
#[3,] NA NA NA NA
#[4,] NA NA NA NA
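The padding in this approach relies on the fact that indexing a vector past its end returns NA. A minimal sketch of that behaviour, assuming a hypothetical vector x that is not part of data0:

```r
# Hypothetical example vector, for illustration only
x <- c(NA, 5, NA, 7)

# Positions of the missing values
pos <- which(is.na(x))

# Indexing with 1..length(x): indices beyond length(pos) yield NA
padded <- pos[seq_along(x)]
print(padded)
```

Because every column is padded to the same length, sapply can simplify the result to a matrix.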
Upvotes: 1
Reputation: 886948
Loop over the list with sapply, assign the length to nrow of 'data0' to append NA at the end where there are fewer elements, and wrap with data.frame
as.data.frame(sapply(list1, `length<-`, nrow(data0)))
-output
   A  B  C  D
1  1  2  3 NA
2  2 NA  4 NA
3 NA NA NA NA
4 NA NA NA NA
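As a rough illustration of why assigning a length pads with NA, here is a small sketch using a hypothetical position vector pos (standing in for one element of list1):

```r
# Hypothetical position vector, for illustration only
pos <- c(3L, 4L)

# Functional form: returns a copy extended to length 4, padded with NA
padded <- `length<-`(pos, 4)

# Equivalent replacement form, modifying a copy in place
pos2 <- pos
length(pos2) <- 4

print(padded)
```

Both forms give the same result; the functional form is convenient to pass to sapply.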
We could also do this with dplyr:
library(dplyr)
data0 %>%
  mutate(across(everything(), ~ replace(rep(NA_integer_, n()),
      is.na(.), which(is.na(.)))[order(!is.na(.))]))
   A  B  C  D
1  1  2  3 NA
2  2 NA  4 NA
3 NA NA NA NA
4 NA NA NA NA
If we don't need to order the values, i.e. each position stays in the row where the NA occurs:
NA^(!is.na(data0)) * row(data0)
      A  B  C  D
[1,]  1 NA NA NA
[2,]  2  2 NA NA
[3,] NA NA  3 NA
[4,] NA NA  4 NA
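The expression above works because NA^0 is 1 while NA^1 is NA in R. A minimal sketch on a single hypothetical vector x (not part of data0), with seq_along(x) standing in for row():

```r
# Hypothetical example vector, for illustration only
x <- c(NA, 5, NA, 7)

# !is.na(x) is FALSE (0) where x is missing, TRUE (1) elsewhere,
# so NA^(...) is 1 at missing positions and NA everywhere else
keep <- NA^(!is.na(x))

# Multiplying by the index keeps the position only where x is missing
res <- keep * seq_along(x)
print(res)
```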
Upvotes: 1