Nautica
Nautica

Reputation: 2018

Duplicating subset of rows with duplicates having different values?

Sample of data:

   Date    Value GeographyName                               Newdate
   <chr>   <dbl> <chr>                                         <int>
 1 2011/12 0.698 NHS Wigan Borough CCG                          2012
 2 2011/12 0.674 NHS Gateshead CCG                              2012
 3 2012/13 0.775 NHS North Hampshire CCG                        2013
 4 2012/13 0.686 NHS St Helens CCG                              2013
 5 2012/13 0.716 NHS Wakefield CCG                              2013
 6 2012/13 0.750 NHS West Lancashire CCG                        2013
 7 2012/13 0.722 NHS Hull CCG                                   2013
 8 2013/14 0.746 NHS Brent CCG                                  2014
 9 2013/14 0.776 NHS Hambleton, Richmondshire and Whitby CCG    2014
10 2013/14 0.686 NHS Barnsley CCG                               2014

I wish to duplicate the rows with the year 2012 in the Newdate vector three times, for a total of six new duplicated rows. However, I want two of the new rows to now have a Newdate value of 2011, another two rows to have a value of 2010, and the last two of the new rows to have a value of 2009. Is there a way to achieve this in the duplication process?

Upvotes: 1

Views: 31

Answers (1)

MKR
MKR

Reputation: 20085

The dplyr::bind_rows provides flexibility to bind rows of multiple data frames. One can first filter df to have rows containing Newdate == 2012 and then merge it multiple times using bind_rows. Modify the Newdate per description by OP and then merge it back with original df.

library(dplyr)

df %>% filter(Newdate == 2012) %>%
  bind_rows(., ., .) %>%  #Duplicating rows 3 times
  mutate(Newdate = Newdate - (row_number()+1) %/% 2) %>%
  bind_rows(df, .)

#       Date Value                               GeographyName Newdate
# 1  2011/12 0.698                       NHS Wigan Borough CCG    2012
# 2  2011/12 0.674                           NHS Gateshead CCG    2012
# 3  2012/13 0.775                     NHS North Hampshire CCG    2013
# 4  2012/13 0.686                           NHS St Helens CCG    2013
# 5  2012/13 0.716                           NHS Wakefield CCG    2013
# 6  2012/13 0.750                     NHS West Lancashire CCG    2013
# 7  2012/13 0.722                                NHS Hull CCG    2013
# 8  2013/14 0.746                               NHS Brent CCG    2014
# 9  2013/14 0.776 NHS Hambleton, Richmondshire and Whitby CCG    2014
# 10 2013/14 0.686                            NHS Barnsley CCG    2014
# 11 2011/12 0.698                       NHS Wigan Borough CCG    2011
# 12 2011/12 0.674                           NHS Gateshead CCG    2011
# 13 2011/12 0.698                       NHS Wigan Borough CCG    2010
# 14 2011/12 0.674                           NHS Gateshead CCG    2010
# 15 2011/12 0.698                       NHS Wigan Borough CCG    2009
# 16 2011/12 0.674                           NHS Gateshead CCG    2009

Data:

df <- read.table(text = 
"Date    Value GeographyName                               Newdate
1 2011/12 0.698 'NHS Wigan Borough CCG'                          2012
2 2011/12 0.674 'NHS Gateshead CCG'                              2012
3 2012/13 0.775 'NHS North Hampshire CCG'                        2013
4 2012/13 0.686 'NHS St Helens CCG'                              2013
5 2012/13 0.716 'NHS Wakefield CCG'                              2013
6 2012/13 0.750 'NHS West Lancashire CCG'                        2013
7 2012/13 0.722 'NHS Hull CCG'                                   2013
8 2013/14 0.746 'NHS Brent CCG'                                  2014
9 2013/14 0.776 'NHS Hambleton, Richmondshire and Whitby CCG'    2014
10 2013/14 0.686 'NHS Barnsley CCG'                               2014",
stringsAsFactors = FALSE, header = TRUE)

Upvotes: 1

Related Questions