Reputation: 843
I have a data frame containing a shape-file that I want to merge with another data-set that contains years. I'm interested in adding a variable with years to the former while all other variables remain the same for each year. I'm not sure how to do this.
As an example, say I have the following data-set:
a <- data.frame(code = c("aaa" , "bbb", "ccc") ,
item = c("apples" , "bananas" , "carrots") ,
id = c(1,2,3))
giving the following:
code item id
1 aaa apples 1
2 bbb bananas 2
3 ccc carrots 3
I would like to add a new variable called year
of length n that repeats all the same elements of the other variables for each year. For example, say I'd like to add the years 1990 to 1992 to an existing object like this:
code item id year
1 aaa apples 1 1990
2 aaa apples 1 1991
3 aaa apples 1 1992
4 bbb bananas 2 1990
5 bbb bananas 2 1991
6 bbb bananas 2 1992
7 ccc carrots 3 1990
8 ccc carrots 3 1991
9 ccc carrots 3 1992
Is there a way of doing this (for existing data frames)? For this example I used this code;
b <- data.frame(code = rep(c("aaa" , "bbb", "ccc") , each = 3) ,
item = rep(c("apples" , "bananas" , "carrots") , each = 3) ,
id = rep(c(1,2,3) , each = 3) ,
year = rep(c(1990:1992) , times = 3))
but this would not work (or is extremely inefficient) when the data-set is already there or extremely large. Is there a better way of doing this?
Upvotes: 4
Views: 529
Reputation: 16836
Using sqldf
:
library(sqldf)
a <- data.frame(
code = c("aaa" , "bbb", "ccc") ,
item = c("apples" , "bananas" , "carrots") ,
id = c(1, 2, 3)
)
y <- data.frame(year = 1990:1992)
sqldf("SELECT * FROM a
CROSS JOIN y")
Output
code item id year
1 aaa apples 1 1990
2 aaa apples 1 1991
3 aaa apples 1 1992
4 bbb bananas 2 1990
5 bbb bananas 2 1991
6 bbb bananas 2 1992
7 ccc carrots 3 1990
8 ccc carrots 3 1991
9 ccc carrots 3 1992
Upvotes: 1
Reputation: 25313
Another possible solution, based on tidyr::separate_rows
:
library(tidyverse)
a <- data.frame(code = c("aaa" , "bbb", "ccc") ,
item = c("apples" , "bananas" , "carrots") ,
id = c(1,2,3))
a %>%
mutate(year = list(1990:1992)) %>%
separate_rows(year, sep=",", convert = T)
#> # A tibble: 9 × 4
#> code item id year
#> <chr> <chr> <dbl> <int>
#> 1 aaa apples 1 1990
#> 2 aaa apples 1 1991
#> 3 aaa apples 1 1992
#> 4 bbb bananas 2 1990
#> 5 bbb bananas 2 1991
#> 6 bbb bananas 2 1992
#> 7 ccc carrots 3 1990
#> 8 ccc carrots 3 1991
#> 9 ccc carrots 3 1992
Upvotes: 2
Reputation: 78917
We could replicate the rows N
times and then mutate
the year
column:
library(dplyr)
N <- 3
a %>%
group_by(code) %>%
slice(rep(1:n(), each = N)) %>%
mutate(year= 1990:1992)
code item id year
<chr> <chr> <dbl> <int>
1 aaa apples 1 1990
2 aaa apples 1 1991
3 aaa apples 1 1992
4 bbb bananas 2 1990
5 bbb bananas 2 1991
6 bbb bananas 2 1992
7 ccc carrots 3 1990
8 ccc carrots 3 1991
9 ccc carrots 3 1992
Upvotes: 2
Reputation: 160407
Base R:
b <- data.frame(year = 1990:1992)
merge(a, b, by = NULL)
# code item id year
# 1 aaa apples 1 1990
# 2 bbb bananas 2 1990
# 3 ccc carrots 3 1990
# 4 aaa apples 1 1991
# 5 bbb bananas 2 1991
# 6 ccc carrots 3 1991
# 7 aaa apples 1 1992
# 8 bbb bananas 2 1992
# 9 ccc carrots 3 1992
Data
a <- structure(list(code = c("aaa", "bbb", "ccc"), item = c("apples", "bananas", "carrots"), id = c(1, 2, 3)), class = "data.frame", row.names = c(NA, -3L))
Upvotes: 4
Reputation: 886938
We can create a list
column grouped by 'code' or 'id' and then unnest
the list
library(dplyr)
library(tidyr)
a %>%
group_by(id) %>%
mutate(year = list(1990:1992)) %>%
ungroup %>%
unnest(year)
-output
# A tibble: 9 × 4
code item id year
<chr> <chr> <dbl> <int>
1 aaa apples 1 1990
2 aaa apples 1 1991
3 aaa apples 1 1992
4 bbb bananas 2 1990
5 bbb bananas 2 1991
6 bbb bananas 2 1992
7 ccc carrots 3 1990
8 ccc carrots 3 1991
9 ccc carrots 3 1992
Or use crossing
crossing(a, year = 1990:1992)
# A tibble: 9 × 4
code item id year
<chr> <chr> <dbl> <int>
1 aaa apples 1 1990
2 aaa apples 1 1991
3 aaa apples 1 1992
4 bbb bananas 2 1990
5 bbb bananas 2 1991
6 bbb bananas 2 1992
7 ccc carrots 3 1990
8 ccc carrots 3 1991
9 ccc carrots 3 1992
Upvotes: 3