Reputation: 146
I have the following data frame:
   id participate grade year
1   1          NA     4 1982
2   1           1     4 1982
3   1           4     4 1982
4   4          NA    NA 1987
5   5          NA    NA 1986
6   5          NA     1 1986
7   5          NA     1 1986
8   7          NA     2 1984
9   7           4     2 1984
10  7           1     2 1984
11  9          NA     1 1987
12  9           1     1 1987
13 10          NA    NA 1984
14 10          NA     2 1984
15 10           4     2 1984
16 11          NA     4 1985
17 11           1     4 1985
18 13          NA     3 1985
19 13           1     3 1985
My goal is to identify and delete, per group (id), the rows where "participate" is NA, but only if "participate" is filled in for other rows within the same group.
In this case, that means deleting row 1 for id=1. For id=4 I delete nothing, because the group contains no other information; the same holds for id=5. Likewise, rows 8, 11, 13, 14, 16 and 18 should be deleted.
Here is the desired output:
   id participate grade year
1   1           1     4 1982
2   1           4     4 1982
3   4          NA    NA 1987
4   5          NA    NA 1986
5   5          NA     1 1986
6   5          NA     1 1986
7   7           4     2 1984
8   7           1     2 1984
9   9           1     1 1987
10 10           4     2 1984
11 11           1     4 1985
12 13           1     3 1985
Upvotes: 0
Views: 64
Reputation: 39174
# Load package
library(tidyverse)
# Create example dataset
dat <- tibble(id = c(1L, 1L, 1L, 4L, 5L,
                     5L, 5L, 7L, 7L, 7L,
                     9L, 9L, 10L, 10L, 10L,
                     11L, 11L, 13L, 13L),
              participate = c(NA, 1L, 4L, NA, NA,
                              NA, NA, NA, 4L, 1L,
                              NA, 1L, NA, NA, 4L,
                              NA, 1L, NA, 1L),
              grade = c(4L, 4L, 4L, NA, NA,
                        1L, 1L, 2L, 2L, 2L,
                        1L, 1L, NA, 2L, 2L,
                        4L, 4L, 3L, 3L),
              year = c(1982, 1982, 1982, 1987, 1986,
                       1986, 1986, 1984, 1984, 1984,
                       1987, 1987, 1984, 1984, 1984,
                       1985, 1985, 1985, 1985))
# Filter the data: within each id, keep a row if participate is not NA,
# or if every participate value in that group is NA
dat2 <- dat %>%
  group_by(id) %>%
  filter(!is.na(participate) | all(is.na(participate)))
# See the result
dat2
Source: local data frame [12 x 4]
Groups: id [8]
      id participate grade  year
   <int>       <int> <int> <dbl>
 1     1           1     4  1982
 2     1           4     4  1982
 3     4          NA    NA  1987
 4     5          NA    NA  1986
 5     5          NA     1  1986
 6     5          NA     1  1986
 7     7           4     2  1984
 8     7           1     2  1984
 9     9           1     1  1987
10    10           4     2  1984
11    11           1     4  1985
12    13           1     3  1985
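If you would rather not load the tidyverse, the same rule can probably be expressed in base R. The following is only a sketch, assuming the data are held in a plain data.frame; the helper names is_missing, group_all_missing and keep are made up for illustration. It uses ave() to evaluate all() within each id and spread the result back over the rows of that group.

```r
# Example data copied from the question, as a plain data.frame
dat <- data.frame(
  id = c(1L, 1L, 1L, 4L, 5L, 5L, 5L, 7L, 7L, 7L,
         9L, 9L, 10L, 10L, 10L, 11L, 11L, 13L, 13L),
  participate = c(NA, 1L, 4L, NA, NA, NA, NA, NA, 4L, 1L,
                  NA, 1L, NA, NA, 4L, NA, 1L, NA, 1L),
  grade = c(4L, 4L, 4L, NA, NA, 1L, 1L, 2L, 2L, 2L,
            1L, 1L, NA, 2L, 2L, 4L, 4L, 3L, 3L),
  year = c(1982, 1982, 1982, 1987, 1986, 1986, 1986, 1984, 1984, 1984,
           1987, 1987, 1984, 1984, 1984, 1985, 1985, 1985, 1985)
)

# TRUE for rows where participate is missing (illustrative name)
is_missing <- is.na(dat$participate)

# For each row, TRUE if every participate value in that row's id group is missing
group_all_missing <- ave(is_missing, dat$id, FUN = all)

# Keep rows that have a value, plus all rows of groups with no values at all
keep <- !is_missing | group_all_missing
dat2 <- dat[keep, ]

# Renumber the rows so they run 1..n as in the desired output
rownames(dat2) <- NULL
dat2
```

The condition is the same one passed to filter() above; only the grouping mechanism differs.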
Upvotes: 1