Reputation: 1
I am trying to find patterns in missing values in rows.
For example if I have this data set:
a b c d
1 0.1 NA NA
2 NA 3 4
5 NA 6 NA
I expect the output to be:
n a b c d m
1 0 0 1 1 2
1 0 1 0 0 1
1 0 1 0 1 2
where column n shows the number of rows missing values in column m and 1's indicate missing values (except for columns n and m) .That is, the interpretation of the first row of the output is as follows: 1 row is missing 2 values which are for variables c and d; second row: 1 row is missing 1 value in variable b and so on.
I have tried using the subtable() function in extracat package(archived version) but I cant find the locations of missing values in each variables. I can only find frequencies.
rowmiss<-rowSums(is.na(dat1[1:ncol(dat1)]))
r1<-matrix(rowmiss, nrow=nrow(dat1))
subtable(rowmiss,1)
I expect the output to be as shown above. What I am finding so far is the frequency of missing values in rows but I expect patterns and positions of missing values.
Upvotes: 0
Views: 470
Reputation: 60180
An alternative way of doing this with tidyverse
:
library(tidyverse)
df %>%
mutate_all(~ is.na(.) %>% as.numeric()) %>%
mutate(m = rowSums(.)) %>%
group_by_all() %>%
count()
Output (you may also want to ungroup()
if doing anything further with the df
):
# A tibble: 3 x 6
# Groups: a, b, c, d, m [3]
a b c d m n
<dbl> <dbl> <dbl> <dbl> <dbl> <int>
1 0 0 1 1 2 1
2 0 1 0 0 1 1
3 0 1 0 1 2 1
mice::md.pattern()
also does basically what you want, but returns a matrix with some of the useful info in the rownames, so would require a bit of processing to trun into a dataframe.
Upvotes: 1
Reputation: 66915
Here's a tidyverse approach. The n
column seems redundant, should it be doing something else?
library(tidyverse)
df %>%
rowid_to_column() %>%
gather(col, val, -rowid) %>%
mutate(val = is.na(val) * 1) %>%
group_by(rowid) %>% mutate(m = sum(val)) %>% ungroup() %>%
spread(col, val) %>%
mutate(n = 1) %>%
select(n, a:d, m)
# A tibble: 3 x 6
n a b c d m
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 0 0 1 1 2
2 1 0 1 0 0 1
3 1 0 1 0 1 2
Upvotes: 1