jd06340

Reputation: 73

Changing number of rows based on changing vector in R

I have a data frame listed as follows:

unique Treatment Rep Beak time  nx  survival
1.1.1          1   1    1    0  25         0
1.1.1          1   1    1    0  25         0
1.1.1          1   1    1    0  25         0
1.1.1          1   1    1    2  24         0
1.1.1          1   1    1    2  24         0
1.1.1          1   1    1    4  17         1
1.1.1          1   1    1    4  17         1
1.1.1          1   1    1    4  17         1
1.1.1          1   1    1    4  17         1
1.1.2          1   1    2    0  25         0
1.1.2          1   1    2    0  25         0
1.1.2          1   1    2    2  22         0
1.1.2          1   1    2    2  22         0
1.1.2          1   1    2    2  22         0
1.1.2          1   1    2    2  22         0
1.1.2          1   1    2    4  16         1
1.1.2          1   1    2    4  16         1
1.1.2          1   1    2    4  16         1
1.1.2          1   1    2    4  16         1

I need to filter out rows where survival is 0, but still make sure that the time for those individuals is represented. In essence, within each group defined by unique, for rows whose nx is greater than the group's minimum nx, the number of rows kept should equal the group's maximum nx minus that row's nx. This is the code I came up with:

df <- df %>%
  group_by(unique) %>%
  mutate(nx = case_when(
    nx > min(nx) ~ rep(.$nx, each = max(.$nx) - .$nx)))

The desired data frame should look like:

unique Treatment Rep Beak time  nx  survival
1.1.1          1   1    1    2  24         0 #one row left with nx of 24
1.1.1          1   1    1    4  17         1
1.1.1          1   1    1    4  17         1
1.1.1          1   1    1    4  17         1
1.1.1          1   1    1    4  17         1
1.1.2          1   1    2    2  22         0 #3 rows left with nx of 22
1.1.2          1   1    2    2  22         0
1.1.2          1   1    2    2  22         0
1.1.2          1   1    2    4  16         1
1.1.2          1   1    2    4  16         1
1.1.2          1   1    2    4  16         1
1.1.2          1   1    2    4  16         1

I seem to be having trouble with replicating the rows the appropriate number of times. I tried to coerce it to a matrix and set nrow = max(.$nx)-.$nx but it didn't work out. Can anyone offer some advice?

Upvotes: 0

Views: 43

Answers (1)

paqmo

Reputation: 3739

Data:

library(dplyr)

# tibble() replaces the deprecated data_frame()
dat <- tibble(unique = c(rep("1.1.1", 9),
                         rep("1.1.2", 10)),
              treatment = rep(1, 19),
              Rep = rep(1, 19),
              Break = c(rep(1, 9),
                        rep(2, 10)),
              time = c(0, 0, 0, 2, 2,
                       4, 4, 4, 4,
                       0, 0,
                       2, 2, 2, 2,
                       4, 4, 4, 4),
              nx = c(25, 25, 25,
                     24, 24,
                     17, 17, 17, 17,
                     25, 25,
                     22, 22, 22, 22,
                     16, 16, 16, 16),
              survival = c(rep(0, 5),
                           rep(1, 4),
                           rep(0, 6),
                           rep(1, 4))
              )

First, group by unique and create a variable called keep that holds the difference between max(nx) and nx within each group. Then group by unique and survival. Keep every row where survival == 1, and every row where survival == 0 and nx equals the minimum for that group. After this, we need to drop the survival == 0 rows whose row number within the {unique, survival} group is greater than keep. We can use row_number() to accomplish this, making sure we still keep every row where survival == 1.
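As a small sketch of the first step, the keep values can be inspected on their own before any filtering. The name keep_tbl is only illustrative, and dat is rebuilt here in compact form with just the columns the logic needs (the other columns are omitted) so the snippet runs by itself.

```r
library(dplyr)

# Compact rebuild of the relevant columns of `dat`; the remaining
# columns do not affect the logic and are left out.
dat <- tibble(
  unique   = rep(c("1.1.1", "1.1.2"), times = c(9, 10)),
  nx       = c(rep(c(25, 24, 17), times = c(3, 2, 4)),
               rep(c(25, 22, 16), times = c(2, 4, 4))),
  survival = c(rep(c(0, 1), times = c(5, 4)),
               rep(c(0, 1), times = c(6, 4)))
)

# keep = max(nx) - nx within each `unique` group, one row per distinct nx
keep_tbl <- dat %>%
  group_by(unique) %>%
  mutate(keep = max(nx) - nx) %>%
  ungroup() %>%
  distinct(unique, nx, survival, keep)

keep_tbl
# keep is 0, 1, 8 for 1.1.1 (nx 25, 24, 17)
# and     0, 3, 9 for 1.1.2 (nx 25, 22, 16)
```

The survival == 0 rows at the smallest nx have keep values of 1 and 3, which are the row counts asked for in the question. The complete pipeline then applies the two filters to these values.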

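dat %>%
  group_by(unique) %>%
  # keep = number of survival == 0 rows to retain for this nx
  mutate(keep = max(nx) - nx) %>%
  group_by(unique, survival) %>%
  # among the survival == 0 rows, retain only those at the smallest nx
  filter(survival == 0 & nx == min(nx) |
           survival == 1) %>%
  # retain the first `keep` rows; unlike 1:keep, this keeps none when keep is 0
  filter(row_number() <= keep |
           survival == 1) %>%
  select(-keep) %>%
  ungroup()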

Result:

# A tibble: 12 x 7
   unique treatment   Rep Break  time    nx survival
   <chr>      <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>
 1 1.1.1         1.    1.    1.    2.   24.       0.
 2 1.1.1         1.    1.    1.    4.   17.       1.
 3 1.1.1         1.    1.    1.    4.   17.       1.
 4 1.1.1         1.    1.    1.    4.   17.       1.
 5 1.1.1         1.    1.    1.    4.   17.       1.
 6 1.1.2         1.    1.    2.    2.   22.       0.
 7 1.1.2         1.    1.    2.    2.   22.       0.
 8 1.1.2         1.    1.    2.    2.   22.       0.
 9 1.1.2         1.    1.    2.    4.   16.       1.
10 1.1.2         1.    1.    2.    4.   16.       1.
11 1.1.2         1.    1.    2.    4.   16.       1.
12 1.1.2         1.    1.    2.    4.   16.       1.
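
One way to sanity-check a result like this is to count the remaining survival == 0 rows and compare the counts with max(nx) - nx. The sketch below is only a check under the same assumptions as the answer: res and check are illustrative names, the filtering is a condensed form of the steps described above, and dat is rebuilt compactly with only the relevant columns so the snippet runs on its own.

```r
library(dplyr)

# Compact rebuild of the relevant columns of `dat` (other columns omitted).
dat <- tibble(
  unique   = rep(c("1.1.1", "1.1.2"), times = c(9, 10)),
  nx       = c(rep(c(25, 24, 17), times = c(3, 2, 4)),
               rep(c(25, 22, 16), times = c(2, 4, 4))),
  survival = c(rep(c(0, 1), times = c(5, 4)),
               rep(c(0, 1), times = c(6, 4)))
)

# Condensed form of the filtering steps, with `keep` retained for checking.
res <- dat %>%
  group_by(unique) %>%
  mutate(keep = max(nx) - nx) %>%
  group_by(unique, survival) %>%
  filter(survival == 1 | nx == min(nx)) %>%
  filter(survival == 1 | row_number() <= keep) %>%
  ungroup()

# Observed number of survival == 0 rows next to the expected number (keep).
check <- res %>%
  filter(survival == 0) %>%
  count(unique, nx, keep, name = "n_rows")

check
all(check$n_rows == check$keep)  # TRUE for this data
```

Note that the count can never exceed the number of survival == 0 rows actually present at that nx, so the check would fail on data where a group has fewer such rows than max(nx) - nx.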

Upvotes: 1
