Update rows based on a column condition in R

Question

My dataset looks like:

PID  V1  V2          V3          V4          V5
123  1   13-06-2004  12-08-2002  19-03-2003  2
123  2   19-07-2008  18-05-2006  31-05-2007  2
234  1   08-07-2010  07-07-2007  07-05-2008  3
345  1   11-12-2012  13-11-2011  12-06-2012  1
456  1   17-09-2018  15-08-2015  29-10-2016  3

I calculated V5 as the number of years between V3 and V2, i.e. (V2 - V3) / 365. Now, for PIDs that have only a single record, I need to update V5 using V2 - V4 instead. Expected output:

PID  V1  V2          V3          V4          V5
123  1   13-06-2004  12-08-2002  19-03-2003  2
123  2   19-07-2008  18-05-2006  31-05-2007  2
234  1   08-07-2010  07-07-2007  07-05-2008  2
345  1   11-12-2012  13-11-2011  12-06-2012  0.6
456  1   17-09-2018  15-08-2015  29-10-2016  2

I am struggling to update only the PIDs that have a single record.

Ronak Shah · Accepted Answer

My numbers don't match your expected output, but based on your description I think you are trying to update the V5 values for groups with only one row, which can be done as:

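library(dplyr)

df %>%
  # parse the dd-mm-yyyy strings into Date objects
  mutate(across(V2:V4, lubridate::dmy)) %>%
  group_by(PID) %>%
  # recompute V5 only for PIDs that occur once; keep it otherwise
  mutate(V5 = if (n() == 1) as.numeric((V2 - V4) / 365) else V5)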

#    PID    V1 V2         V3         V4            V5
#1   123     1 2004-06-13 2002-08-12 2003-03-19 2    
#2   123     2 2008-07-19 2006-05-18 2007-05-31 2    
#3   234     1 2010-07-08 2007-07-07 2008-05-07 2.17 
#4   345     1 2012-12-11 2011-11-13 2012-06-12 0.499
#5   456     1 2018-09-17 2015-08-15 2016-10-29 1.88
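
For anyone who wants to reproduce this without extra packages, here is a minimal base R sketch of the same idea. It is not the code from the answer above: it swaps `group_by()`/`n()` for `ave()` to count rows per PID, and it assumes the date columns arrive as dd-mm-yyyy character strings. The data frame is rebuilt from the sample in the question; `date_cols`, `group_size` and `single` are names made up for this illustration.

```r
# Sample data copied from the question (dates kept as dd-mm-yyyy strings;
# storing them as character is an assumption about the original data)
df <- data.frame(
  PID = c(123, 123, 234, 345, 456),
  V1  = c(1, 2, 1, 1, 1),
  V2  = c("13-06-2004", "19-07-2008", "08-07-2010", "11-12-2012", "17-09-2018"),
  V3  = c("12-08-2002", "18-05-2006", "07-07-2007", "13-11-2011", "15-08-2015"),
  V4  = c("19-03-2003", "31-05-2007", "07-05-2008", "12-06-2012", "29-10-2016"),
  V5  = c(2, 2, 3, 1, 3),
  stringsAsFactors = FALSE
)

# Parse the three date columns into Date objects
date_cols <- c("V2", "V3", "V4")
df[date_cols] <- lapply(df[date_cols], as.Date, format = "%d-%m-%Y")

# Number of rows per PID, repeated on every row of that PID
group_size <- ave(seq_along(df$PID), df$PID, FUN = length)

# Overwrite V5 only where the PID occurs exactly once;
# subtracting two Dates gives a difference in days
single <- group_size == 1
df$V5[single] <- as.numeric(df$V2[single] - df$V4[single]) / 365

df
```

Rows for PID 123 keep their original V5, while the single-record PIDs get (V2 - V4) / 365, matching the values printed in the answer. If whole years are wanted, wrapping the result in `round()` is one option, though that would still not reproduce the 0.6 shown in the question's expected output.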
