Reputation: 701
For each ID in my data frame, I would like to create a new column called Age populated with the values 0 to 5
(r = 0:5), as shown below.
Data Frame
ID
1124
1123
Desired Outcome
ID Age
1124 0
1124 1
1124 2
1124 3
1124 4
1124 5
1123 0
1123 1
1123 2
1123 3
1123 4
1123 5
Upvotes: 0
Views: 285
Reputation: 5532
Here is a base R version:
df = data.frame(ID = c(1124, 1123))
expand.grid(ID = df$ID, Age = 0:5)
##      ID Age
## 1  1124   0
## 2  1123   0
## 3  1124   1
## 4  1123   1
## 5  1124   2
## 6  1123   2
## 7  1124   3
## 8  1123   3
## 9  1124   4
## 10 1123   4
## 11 1124   5
## 12 1123   5
This is sorted differently from the tidyr::expand result.
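If the row order matters (the desired outcome lists every Age for one ID before moving on to the next), here is a minimal base R sketch of two ways to get it. The names ages, res_rep and res_sorted are only illustrative and not part of the original answer.

```r
df <- data.frame(ID = c(1124, 1123))
ages <- 0:5

# Option 1: build both columns with rep(), keeping the IDs in their original order.
res_rep <- data.frame(
  ID  = rep(df$ID, each = length(ages)),
  Age = rep(ages, times = nrow(df))
)

# Option 2: reorder the expand.grid() result so that each ID's rows sit together.
# Note that order() sorts the IDs ascending, so 1123 comes before 1124 here.
res_grid <- expand.grid(ID = df$ID, Age = ages)
res_sorted <- res_grid[order(res_grid$ID, res_grid$Age), ]
rownames(res_sorted) <- NULL
```

Option 1 matches the order in the question (1124 first), while Option 2 matches the sorted order that tidyr::expand returns.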
EDIT
As @thelatemail suggested, you can do either of the following to avoid retyping the column names of df:
expand.grid(c(Age=list(0:5), df))
or
merge(df, list(Age=0:5))
EDIT 2
Here is a data.table example:
library(data.table)
setDT(df) # Convert df to a data.table.
df[, do.call(CJ, list(ID = ID, Age = 0:5))]
For large data sets, one might want to benchmark the various methods.
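As a rough starting point for such a benchmark, here is a sketch using only base R's system.time(). The input size of 100,000 IDs and the names big, ages, r_grid and r_merge are assumptions chosen for illustration; a dedicated package such as microbenchmark would likely give more reliable numbers.

```r
# Hypothetical larger input: 100,000 distinct IDs (an assumption for illustration).
big <- data.frame(ID = seq_len(1e5))
ages <- 0:5

# Time two of the base R approaches shown above.
t_grid  <- system.time(r_grid  <- expand.grid(ID = big$ID, Age = ages))
t_merge <- system.time(r_merge <- merge(big, list(Age = ages)))

# Sanity check: both should give one row per ID/Age combination.
stopifnot(nrow(r_grid) == nrow(big) * length(ages),
          nrow(r_merge) == nrow(r_grid))

# Elapsed seconds; the values depend on the machine and the R version.
print(c(expand.grid = t_grid[["elapsed"]], merge = t_merge[["elapsed"]]))
```

A single system.time() call is noisy, so repeating each measurement a few times before drawing conclusions is advisable.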
Upvotes: 2
Reputation: 60060
This can be done with tidyr::expand:
library(tidyverse)
df = tibble(ID = c(1124, 1123))  # tibble() replaces the deprecated data_frame()
df %>%
  expand(ID, Age = 0:5)
Output:
# A tibble: 12 x 2
      ID   Age
   <dbl> <int>
 1  1123     0
 2  1123     1
 3  1123     2
 4  1123     3
 5  1123     4
 6  1123     5
 7  1124     0
 8  1124     1
 9  1124     2
10  1124     3
11  1124     4
12  1124     5
Upvotes: 1
Reputation: 378
library(tidyverse)
your_data_frame %>%
group_by(ID) %>%
mutate(Age = (1:n()) - 1)
This also works if you have more than 6 Age values per ID, because it numbers the rows that already exist within each ID.
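A small sketch of this approach on a hypothetical input, assuming the data frame already contains six rows per ID (long_df and res are illustrative names). The pipeline numbers existing rows and does not add new ones, so with a single row per ID it would only produce Age = 0.

```r
library(dplyr)

# Hypothetical input that already has six rows per ID (an assumption for illustration).
long_df <- tibble(ID = rep(c(1124, 1123), each = 6))

res <- long_df %>%
  group_by(ID) %>%
  mutate(Age = (1:n()) - 1) %>%  # number the rows within each ID, starting at 0
  ungroup()
```

If the data has only one row per ID, as in the question, one of the expansion approaches in the other answers is needed first.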
Upvotes: 0