Reputation: 902
I would like to know how many NAs there are per ID. For example, in the case of ID = 1, from Monday till Friday there are a total of 4 NAs.
Is there any solution in R?
My df:
My output:
My sample data:
structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA),
start.end = c("Mo_start", "Mo_end", "Tue_start", "Tue_end",
"Wed_start", "Wed_end", "Thur_start", "Thur_end", "Fri_start",
"Fri_end", "Mo_start", "Mo_end", "Tue_start", "Tue_end",
"Wed_start", "Wed_end", "Thur_start", "Thur_end", "Fri_start",
"Fri_end", "Mo_start", "Mo_end", "Tue_start", "Tue_end",
"Wed_start", "Wed_end", "Thur_start", "Thur_end", "Fri_start",
"Fri_end", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA), time = structure(c(NA, NA,
25200, 53100, 25200, 53100, 25200, 53100, NA, NA, NA, NA,
NA, NA, 32400, 56700, NA, NA, NA, NA, 35100, 53100, 23400,
53100, 23400, 31500, 23400, 31500, 23400, 53100, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA), class = c("hms", "difftime"), units = "secs"),
X4 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA)), class = c("spec_tbl_df", "tbl_df",
"tbl", "data.frame"), row.names = c(NA, -80L), spec = structure(list(
cols = list(id = structure(list(), class = c("collector_double",
"collector")), start.end = structure(list(), class = c("collector_character",
"collector")), time = structure(list(format = ""), class = c("collector_time",
"collector")), X4 = structure(list(), class = c("collector_logical",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))
Upvotes: 0
Reputation: 78927
Update:
Using setNames to give the result clearer column names:
setNames(aggregate(time ~ id, data=df, function(x) {sum(is.na(x))}, na.action = NULL), c("id", "NAs/day"))
Output:
  id NAs/day
1  1       4
2  2       8
3  3       0
First answer:
We could use aggregate:
aggregate(time ~ id, data=df, function(x) {sum(is.na(x))}, na.action = NULL)
Output:
  id time
1  1    4
2  2    8
3  3    0
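To illustrate why na.action = NULL matters in the call above, here is a minimal sketch on a small made-up data frame (toy is hypothetical and not the question's data). By default the formula method of aggregate drops rows containing NA before the function is applied, so the NAs could never be counted.

```r
# Hypothetical toy data mirroring the question's id/time layout
toy <- data.frame(id = c(1, 1, 2, 2, 3), time = c(NA, 100, NA, NA, 300))

# na.action = NULL keeps the NA rows, so is.na() can see them
kept <- aggregate(time ~ id, data = toy,
                  FUN = function(x) sum(is.na(x)), na.action = NULL)

# Default na.action (na.omit) removes NA rows first:
# id 2 vanishes entirely and the remaining counts are all 0
dropped <- aggregate(time ~ id, data = toy,
                     FUN = function(x) sum(is.na(x)))

print(kept)
print(dropped)
```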
Upvotes: 1
Reputation: 99
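library(data.table)
# `dataset` is the data frame from the question (called df there)
dataset <- as.data.table(dataset)
# Count the rows with a missing time per id; ids without any NA (id 3 in the sample data) do not appear in the result
dataset[is.na(time), .N, by = id]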
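If ids with zero missing values should also be listed, one possible variation is to sum is.na() per group instead of filtering first. This is only a sketch on made-up data (toy is a hypothetical stand-in for dataset):

```r
library(data.table)

# Hypothetical toy data standing in for `dataset`
toy <- data.table(id = c(1, 1, 2, 2, 3), time = c(NA, 100, NA, NA, 300))

# Filtering in i first drops ids that have no NA at all (id 3 here)
filtered <- toy[is.na(time), .N, by = id]

# Summing is.na() in j keeps every id, including those with a zero count
all_ids <- toy[, .(N = sum(is.na(time))), by = id]

print(filtered)
print(all_ids)
```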
Upvotes: 1
Reputation: 12461
library(dplyr)

# Count missing `time` values within each id
df %>%
  group_by(id) %>%
  summarise(Missing = sum(is.na(time)), .groups = "drop")
Giving:
# A tibble: 4 x 2
     id Missing
  <dbl>   <int>
1     1       4
2     2       8
3     3       0
4    NA      50
Your sample data appears not to match your expected output.
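One visible difference from the aggregate() result in the other answer is the extra row for id = NA, which comes from the 50 rows of the sample data that have no id. If those rows are not wanted, a possible approach is to filter them out before grouping. A minimal sketch on made-up data (toy is hypothetical):

```r
library(dplyr)

# Hypothetical toy data: two rows have no id at all
toy <- tibble(id = c(1, 1, 2, NA, NA), time = c(NA, 100, NA, NA, NA))

# Drop rows without an id, then count missing times per id
result <- toy %>%
  filter(!is.na(id)) %>%
  group_by(id) %>%
  summarise(Missing = sum(is.na(time)), .groups = "drop")

print(result)
```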
Upvotes: 2