Reputation: 73
I have a data frame listed as follows:
unique Treatment Rep Beak time nx survival
1.1.1 1 1 1 0 25 0
1.1.1 1 1 1 0 25 0
1.1.1 1 1 1 0 25 0
1.1.1 1 1 1 2 24 0
1.1.1 1 1 1 2 24 0
1.1.1 1 1 1 4 17 1
1.1.1 1 1 1 4 17 1
1.1.1 1 1 1 4 17 1
1.1.1 1 1 1 4 17 1
1.1.2 1 1 2 0 25 0
1.1.2 1 1 2 0 25 0
1.1.2 1 1 2 2 22 0
1.1.2 1 1 2 2 22 0
1.1.2 1 1 2 2 22 0
1.1.2 1 1 2 2 22 0
1.1.2 1 1 2 4 16 1
1.1.2 1 1 2 4 16 1
1.1.2 1 1 2 4 16 1
1.1.2 1 1 2 4 16 1
I need to filter out the rows where survival is 0, but still make sure that the time for those individuals is represented. In essence, I would like to modify the rows such that, where nx is greater than the minimum nx value within its unique group, the number of rows kept equals the maximum nx value of that group minus nx. This is the code I came up with:
df <- df %>%
  group_by(unique) %>%
  mutate(nx = case_when(
    nx > min(nx) ~ rep(.$nx, each = max(.$nx)-.$nx)))
The desired data frame should look like:
unique Treatment Rep Beak time nx survival
1.1.1 1 1 1 2 24 0 #one row left with nx of 24
1.1.1 1 1 1 4 17 1
1.1.1 1 1 1 4 17 1
1.1.1 1 1 1 4 17 1
1.1.1 1 1 1 4 17 1
1.1.2 1 1 2 2 22 0 #3 rows left with nx of 22
1.1.2 1 1 2 2 22 0
1.1.2 1 1 2 2 22 0
1.1.2 1 1 2 4 16 1
1.1.2 1 1 2 4 16 1
1.1.2 1 1 2 4 16 1
1.1.2 1 1 2 4 16 1
I seem to be having trouble with replicating the rows the appropriate number of times. I tried to coerce it to a matrix and set nrow = max(.$nx)-.$nx but it didn't work out. Can anyone offer some advice?
Upvotes: 0
Views: 43
Reputation: 3739
Data:
library(dplyr)

# tibble() replaces the deprecated data_frame()
dat <- tibble(unique = c(rep("1.1.1", 9),
                         rep("1.1.2", 10)),
              treatment = rep(1, 19),
              Rep = rep(1, 19),
              Break = c(rep(1, 9),
                        rep(2, 10)),
              time = c(0, 0, 0, 2, 2,
                       4, 4, 4, 4,
                       0, 0,
                       2, 2, 2, 2,
                       4, 4, 4, 4),
              nx = c(25, 25, 25,
                     24, 24,
                     17, 17, 17, 17,
                     25, 25,
                     22, 22, 22, 22,
                     16, 16, 16, 16),
              survival = c(rep(0, 5),
                           rep(1, 4),
                           rep(0, 6),
                           rep(1, 4))
)
First, group by unique and create a variable called keep that holds the difference between max(nx) and nx within each group. Then, group by unique and survival. Keep every row where survival == 1, and every row where survival == 0 and nx is equal to the minimum nx of its {unique, survival} group. After this, we need to drop the survival == 0 rows whose position within that group is greater than keep. We can use row_number() to accomplish this, while the survival == 1 condition makes sure every row with survival == 1 is still kept.
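dat %>%
  group_by(unique) %>%
  # rows to keep for each nx value: group maximum minus nx
  mutate(keep = max(nx) - nx) %>%
  group_by(unique, survival) %>%
  # among survival == 0 rows, retain only those with the smallest nx
  filter(survival == 0 & nx == min(nx) |
           survival == 1) %>%
  # keep the first `keep` of those rows; `<=` also copes with keep == 0
  filter(row_number() <= keep |
           survival == 1) %>%
  ungroup() %>%
  select(-keep)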
Result:
# A tibble: 12 x 7
unique treatment Rep Break time nx survival
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1.1.1 1. 1. 1. 2. 24. 0.
2 1.1.1 1. 1. 1. 4. 17. 1.
3 1.1.1 1. 1. 1. 4. 17. 1.
4 1.1.1 1. 1. 1. 4. 17. 1.
5 1.1.1 1. 1. 1. 4. 17. 1.
6 1.1.2 1. 1. 2. 2. 22. 0.
7 1.1.2 1. 1. 2. 2. 22. 0.
8 1.1.2 1. 1. 2. 2. 22. 0.
9 1.1.2 1. 1. 2. 4. 16. 1.
10 1.1.2 1. 1. 2. 4. 16. 1.
11 1.1.2 1. 1. 2. 4. 16. 1.
12 1.1.2 1. 1. 2. 4. 16. 1.
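To see where the row counts come from, it can help to look at keep for each distinct combination of unique, survival and nx. The sketch below rebuilds the same data and prints that table. It then shows one possible single-filter variant, which is not the code above but an alternative that also groups by nx, so each nx value is capped at its own keep. The names keep_table and result are only illustrative, and the variant assumes every non-surviving nx value has at least keep rows available.

```r
library(dplyr)

# Same example data as above
dat <- tibble(
  unique    = c(rep("1.1.1", 9), rep("1.1.2", 10)),
  treatment = 1,
  Rep       = 1,
  Break     = c(rep(1, 9), rep(2, 10)),
  time      = c(0, 0, 0, 2, 2, 4, 4, 4, 4,
                0, 0, 2, 2, 2, 2, 4, 4, 4, 4),
  nx        = c(25, 25, 25, 24, 24, 17, 17, 17, 17,
                25, 25, 22, 22, 22, 22, 16, 16, 16, 16),
  survival  = c(rep(0, 5), rep(1, 4), rep(0, 6), rep(1, 4))
)

# One row per (unique, survival, nx) with its keep value.
# For survival == 0: nx 25 gives keep 0, nx 24 gives keep 1, nx 22 gives keep 3.
keep_table <- dat %>%
  group_by(unique) %>%
  mutate(keep = max(nx) - nx) %>%
  ungroup() %>%
  distinct(unique, survival, nx, keep)
print(keep_table)

# Alternative sketch: group by nx as well, then cap each survival == 0
# group at its own keep while leaving survival == 1 rows untouched.
result <- dat %>%
  group_by(unique) %>%
  mutate(keep = max(nx) - nx) %>%
  group_by(unique, survival, nx) %>%
  filter(survival == 1 | row_number() <= keep) %>%
  ungroup() %>%
  select(-keep)
print(result)
```

On this data the variant returns the same 12 rows as the result shown above. Because nx is part of the grouping, rows with nx equal to the group maximum get keep == 0 and drop out without a separate min(nx) step.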
Upvotes: 1