user11418708

Reputation: 902

Count NAs per id

I would like to know how many NAs there are per ID. For example, for ID = 1 there are 4 NAs in total from Monday to Friday. Is there a way to do this in R?

My df: [image not available]

My output: [image not available]

My sample data:

    structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 
    2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), 
        start.end = c("Mo_start", "Mo_end", "Tue_start", "Tue_end", 
        "Wed_start", "Wed_end", "Thur_start", "Thur_end", "Fri_start", 
        "Fri_end", "Mo_start", "Mo_end", "Tue_start", "Tue_end", 
        "Wed_start", "Wed_end", "Thur_start", "Thur_end", "Fri_start", 
        "Fri_end", "Mo_start", "Mo_end", "Tue_start", "Tue_end", 
        "Wed_start", "Wed_end", "Thur_start", "Thur_end", "Fri_start", 
        "Fri_end", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA), time = structure(c(NA, NA, 
        25200, 53100, 25200, 53100, 25200, 53100, NA, NA, NA, NA, 
        NA, NA, 32400, 56700, NA, NA, NA, NA, 35100, 53100, 23400, 
        53100, 23400, 31500, 23400, 31500, 23400, 53100, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), class = c("hms", "difftime"), units = "secs"), 
        X4 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA)), class = c("spec_tbl_df", "tbl_df", 
    "tbl", "data.frame"), row.names = c(NA, -80L), spec = structure(list(
        cols = list(id = structure(list(), class = c("collector_double", 
        "collector")), start.end = structure(list(), class = c("collector_character", 
        "collector")), time = structure(list(format = ""), class = c("collector_time", 
        "collector")), X4 = structure(list(), class = c("collector_logical", 
        "collector"))), default = structure(list(), class = c("collector_guess", 
        "collector")), skip = 1L), class = "col_spec"))

Upvotes: 0

Views: 270

Answers (3)

TarJae

Reputation: 78927

Update: use setNames to give the result descriptive column names:

    setNames(aggregate(time ~ id, data = df, function(x) sum(is.na(x)), na.action = NULL), c("id", "NAs/day"))

Output:

      id NAs/day
    1  1       4
    2  2       8
    3  3       0

First answer: we could use aggregate:

    aggregate(time ~ id, data = df, function(x) sum(is.na(x)), na.action = NULL)

Output:

      id time
    1  1    4
    2  2    8
    3  3    0
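The `na.action` argument is what makes this work: the formula method of `aggregate` defaults to `na.omit`, which removes rows with a missing `time` before the function ever sees them. The sketch below illustrates the difference on a small made-up data frame (`toy` is hypothetical, not the question's data) and uses `na.pass`, which likewise keeps the NA rows.

```r
# Hypothetical toy data: id 1 has one NA, id 2 has two, id 3 has none
toy <- data.frame(
  id   = c(1, 1, 2, 2, 3, 3),
  time = c(NA, 25200, NA, NA, 23400, 53100)
)

count_na <- function(x) sum(is.na(x))

# Default na.action (na.omit): NA rows are dropped first, so nothing is left to count
dropped <- aggregate(time ~ id, data = toy, FUN = count_na)

# na.pass keeps the NA rows so they reach count_na
kept <- aggregate(time ~ id, data = toy, FUN = count_na, na.action = na.pass)

print(dropped)
print(kept)
```

With the default, an id whose times are all missing disappears from the result entirely, which is easy to misread as "no NAs".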

Upvotes: 1

foreach

Reputation: 99

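    library(data.table)
    dataset <- as.data.table(df)  # df is the sample data from the question
    # Count the rows with a missing time per id; ids without any NA do not appear
    dataset[is.na(time), .N, by = id]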
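Filtering on `is.na(time)` before counting returns only the ids that have at least one missing time. If ids with zero NAs should also be listed, one option is to sum `is.na(time)` within each group instead. A minimal sketch on made-up data (`toy` and `counts` are hypothetical names):

```r
library(data.table)

# Hypothetical toy data: id 1 has one NA, id 2 has two, id 3 has none
toy <- data.table(
  id   = c(1, 1, 2, 2, 3, 3),
  time = c(NA, 25200, NA, NA, 23400, 53100)
)

# Summing is.na() inside each group keeps ids whose count is zero
counts <- toy[, .(N = sum(is.na(time))), by = id]
print(counts)
```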

Upvotes: 1

Limey

Reputation: 12461

    library(dplyr)

    df %>%
      group_by(id) %>%
      summarise(Missing = sum(is.na(time)), .groups = "drop")

This gives:

    # A tibble: 4 x 2
         id Missing
      <dbl>   <int>
    1     1       4
    2     2       8
    3     3       0
    4    NA      50

Your sample data appears not to match your expected output.
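The fourth row comes from the 50 rows of the sample data whose `id` is itself NA. If that group is not wanted, one possibility is to drop those rows before grouping. A minimal sketch, assuming a made-up data frame in place of the real `df` (`toy` and `res` are hypothetical names):

```r
library(dplyr)

# Hypothetical toy data: the last two rows have no id at all
toy <- data.frame(
  id   = c(1, 1, 2, 2, NA, NA),
  time = c(NA, 25200, NA, NA, NA, NA)
)

res <- toy %>%
  filter(!is.na(id)) %>%   # drop rows that belong to no id
  group_by(id) %>%
  summarise(Missing = sum(is.na(time)), .groups = "drop")

print(res)
```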

Upvotes: 2
