Cina

Reputation: 10199

Add a specific new row using dplyr based on some conditions in R

I have the data frame df below, and I want to add a new row to each group defined by ID and semester_num. So far, my dplyr attempt is:

df %>%
  group_by(ID, semester_num)  # add the new row to each group here

I want the new row to have the same values as the previous row, except that the third column (subject_result2) should take the value of the fourth column (Success) from the previous row.

df <- tibble::tribble(
      ~ID, ~semester_num,   ~subject_result2,    ~Success,
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "MATH1PassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  200000L,             1, "OTHERPassedTerm2", "fail",
  200000L,             1, "MATH1PassedTerm2", "fail",
  200000L,             2, "MATH1PassedTerm2", "fail",
  200000L,             2, "OTHERPassedTerm2", "fail"
  )

Expected result (the newly added rows are marked with >>):

          ~ID, ~semester_num,   ~subject_result2,    ~Success,
      100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
      100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
      100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
 >>   100000L,             1, "Grad_ENSC",        "Grad_ENSC",
      100000L,             2, "MATH1PassedTerm1", "Grad_ENSC",
      100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
      100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
 >>   100000L,             2, "Grad_ENSC",        "Grad_ENSC",
      200000L,             1, "OTHERPassedTerm2", "fail",
      200000L,             1, "MATH1PassedTerm2", "fail",
 >>   200000L,             1, "fail",             "fail",
      200000L,             2, "MATH1PassedTerm2", "fail",
      200000L,             2, "OTHERPassedTerm2", "fail",
 >>   200000L,             2, "fail",             "fail"

Please help me implement this in R. (It is totally fine to use other packages as well.)

Upvotes: 1

Views: 960

Answers (1)

divibisan

Reputation: 12155

You can do this by combining do with tibble::add_row. I based this on the answer to the question "Add row in each group using dplyr and add_row()", specifically the comment by @JasonWang.

library(dplyr)  # provides %>%, group_by() and do()

df %>%
    dplyr::group_by(ID, semester_num) %>%
    do(tibble::add_row(.,
                       ID = .$ID[1],
                       semester_num = .$semester_num[1],
                       subject_result2 = .$Success[nrow(.)], # Success value from the last row of the group
                       Success = .$Success[nrow(.)]))

# A tibble: 14 x 4
# Groups:   ID, semester_num [4]
       ID semester_num subject_result2  Success  
    <int>        <dbl> <chr>            <chr>    
 1 100000            1 OTHERPassedTerm1 Grad_ENSC
 2 100000            1 OTHERPassedTerm1 Grad_ENSC
 3 100000            1 OTHERPassedTerm1 Grad_ENSC
 4 100000            1 Grad_ENSC        Grad_ENSC
 5 100000            2 MATH1PassedTerm1 Grad_ENSC
 6 100000            2 OTHERPassedTerm1 Grad_ENSC
 7 100000            2 OTHERPassedTerm1 Grad_ENSC
 8 100000            2 Grad_ENSC        Grad_ENSC
 9 200000            1 OTHERPassedTerm2 fail     
10 200000            1 MATH1PassedTerm2 fail     
11 200000            1 fail             fail     
12 200000            2 MATH1PassedTerm2 fail     
13 200000            2 OTHERPassedTerm2 fail     
14 200000            2 fail             fail  

Normally tibble::add_row won't work with a grouped data frame, but by using do, we can apply it to each group separately without leaving the pipe.
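As a possible alternative, here is a minimal sketch using dplyr::group_modify instead of do (do has been superseded since dplyr 1.0.0). It assumes dplyr 0.8.0 or later and rebuilds df from the question so that it runs on its own; the name result is just an illustrative choice. Inside group_modify, .x holds the rows of one group without the grouping columns, so only subject_result2 and Success need to be supplied.

```r
library(dplyr)
library(tibble)

# Sample data copied from the question
df <- tribble(
      ~ID, ~semester_num,   ~subject_result2,    ~Success,
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "MATH1PassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  200000L,             1, "OTHERPassedTerm2", "fail",
  200000L,             1, "MATH1PassedTerm2", "fail",
  200000L,             2, "MATH1PassedTerm2", "fail",
  200000L,             2, "OTHERPassedTerm2", "fail"
)

result <- df %>%
  group_by(ID, semester_num) %>%
  # .x is the current group's data without the grouping columns;
  # the group keys (ID, semester_num) are added back automatically
  group_modify(~ add_row(.x,
                         subject_result2 = last(.x$Success),
                         Success = last(.x$Success))) %>%
  ungroup()

print(result, n = Inf)
```

For this sample data the sketch should give the same 14 rows as the do version above, with the extra row placed at the end of each (ID, semester_num) group.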

Upvotes: 3
