Reputation: 312

Generate a sequence of new rows based on a few columns

I want to create new rows in sequence for a group of variables in a data frame. For example, I have these dummy data

data1 <- data.frame(id = c('JUJ', 'SJD'), 
                    sex = c('male', 'female'),
                    year = c(2000, 2010),
                    age = c(48, 75), blood = c(6.85, 4.6))
data1

| id  | sex    | year | age | blood |
|-----|--------|------|-----|-------|
| JUJ | male   | 2000 | 48  | 6.85  |
| SJD | female | 2010 | 75  | 4.6   |

I want to generate 4 more observations (as rows) for each id. And for year and age, each new row should be larger by 1 unit from the previous row. And for some variables, for example in these data, sex and blood should stay the same in all rows.

I am sure seq() function in R will work but some how I could find the right way to use it. I would prefer if the solution contains tidyverse function.

At the end, the data would look like this

data2 <- data.frame(id = c('JUJ', 'JUJ', 'JUJ', 'JUJ', 'SJD', 'SJD', 
                   'SJD', 'SJD'), 
                    sex = c('male', 'male', 'male', 'male', 'female', 
                   'female', 'female', 'female'),
                    year = c(2000, 2001, 2002, 2003, 2010, 2011, 2012, 2013),
                    age = c(48, 49, 50, 51, 75, 76, 77, 78), 
                    blood = c(6.85, 6.85, 6.85, 6.85, 4.6, 4.6, 4.6, 4.6))
data2

| id  | sex    | year | age | blood |
|-----|--------|------|-----|-------|
| JUJ | male   | 2000 | 48  | 6.85  |
| JUJ | male   | 2001 | 49  | 6.85  |
| JUJ | male   | 2002 | 50  | 6.85  |
| JUJ | male   | 2003 | 51  | 6.85  |
| SJD | female | 2010 | 75  | 4.6   |
| SJD | female | 2011 | 76  | 4.6   |
| SJD | female | 2012 | 77  | 4.6   |
| SJD | female | 2013 | 78  | 4.6   |

Upvotes: 3

Answers (3)

M--

Reputation: 29237

Another tidyverse solution:

library(tidyverse)

data1 %>% 
  mutate_at(vars(year, age), list(~ map(. ,~seq(.x, .x + 4 - 1))))%>% 
  unnest %>% select(-blood, blood)

#>    id    sex year age blood
#> 1 JUJ   male 2000  48  6.85
#> 2 JUJ   male 2001  49  6.85
#> 3 JUJ   male 2002  50  6.85
#> 4 JUJ   male 2003  51  6.85
#> 5 SJD female 2010  75  4.60
#> 6 SJD female 2011  76  4.60
#> 7 SJD female 2012  77  4.60
#> 8 SJD female 2013  78  4.60

Upvotes: 0

tmfmnk

Reputation: 40171

Another dplyr and tidyr possibility could be:

data1 %>%
 group_by(id) %>%
 uncount(4) %>%
 mutate_at(vars(year, age), ~ . + row_number() - 1)

  id    sex     year   age blood
  <fct> <fct>  <dbl> <dbl> <dbl>
1 JUJ   male    2000    48  6.85
2 JUJ   male    2001    49  6.85
3 JUJ   male    2002    50  6.85
4 JUJ   male    2003    51  6.85
5 SJD   female  2010    75  4.6 
6 SJD   female  2011    76  4.6 
7 SJD   female  2012    77  4.6 
8 SJD   female  2013    78  4.6

Upvotes: 3

Ronak Shah

Reputation: 389325

We can use slice to repeat rows for n times, group_by id and sequentially increment age and year column.

library(dplyr)

n <- 4
data1 %>%
  slice(rep(seq_len(n()), each = n)) %>%
  group_by(id) %>%
  mutate_at(vars(year, age), ~. + 0:(n - 1))

#  id    sex     year   age blood
#  <fct> <fct>  <dbl> <dbl> <dbl>
#1 JUJ   male    2000    48  6.85
#2 JUJ   male    2001    49  6.85
#3 JUJ   male    2002    50  6.85
#4 JUJ   male    2003    51  6.85
#5 SJD   female  2010    75  4.6 
#6 SJD   female  2011    76  4.6 
#7 SJD   female  2012    77  4.6 
#8 SJD   female  2013    78  4.6

Upvotes: 4

Generate a sequence of new rows based on a few columns

Answers (3)

Related Questions