Reputation: 95
I have a dataset in dyadic format, sorted by group, and I am trying to add a new country to each group. The new country also needs to be paired with every existing member of its group, in both directions. Below is a reproducible example to show what I mean. `data` is a simplified version of my dataset (the real one contains many more groups).
data <- data.frame(country1 = c("BEL", "FRA", "BEL", "FRA", "AUS", "ITA"),
country2 = c("FRA", "BEL", "FRA", "BEL", "ITA", "AUS"),
year = c(2001,2001,2002,2002,2002,2002),
id = c(1,1,1,1,2,2))
> data
country1 country2 year id
1 BEL FRA 2001 1
2 FRA BEL 2001 1
3 BEL FRA 2002 1
4 FRA BEL 2002 1
5 AUS ITA 2002 2
6 ITA AUS 2002 2
I would like to add a different country to each group. For instance, say I would like to add Luxembourg to group 1 and Portugal to group 2.
The output I need should look like this:
> data
country1 country2 year id
1 BEL FRA 2001 1
2 FRA BEL 2001 1
3 LUX BEL 2001 1
4 LUX FRA 2001 1
5 BEL LUX 2001 1
6 FRA LUX 2001 1
7 BEL FRA 2002 1
8 FRA BEL 2002 1
9 LUX BEL 2002 1
10 LUX FRA 2002 1
11 BEL LUX 2002 1
12 FRA LUX 2002 1
13 AUS ITA 2002 2
14 ITA AUS 2002 2
15 POR AUS 2002 2
16 POR ITA 2002 2
17 AUS POR 2002 2
18 ITA POR 2002 2
I found a workaround, but I don't know how to simplify it or automate it.
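library(dplyr)
# Take group 1 and attach the country to add
id1 <- data %>%
  filter(id == 1) %>%
  mutate(country3 = "LUX")
# Pairs with the new country in the country2 position
id1_1 <- id1 %>%
  select(!country2) %>%
  rename("country2" = "country3") %>%
  distinct()
# Pairs with the new country in the country1 position
id1_2 <- id1 %>%
  select(!country1) %>%
  rename("country1" = "country3") %>%
  distinct()
id1_2 <- id1_2[, c(2, 1, 3, 4)]
# rbind() matches data frame columns by name, so column order does not matter here
id1 <- rbind(id1_1, id1_2)
data <- rbind(data, id1)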
This completes the dyads, but it is tedious to repeat: I am trying to add about 100 countries to about 100 groups.
I can create either a vector or a data frame containing all the countries I need to add (and arrange them by group if necessary), but I just don't know how to use them to fill the main data. Thanks for any tips!
Upvotes: 1
Views: 79
Reputation: 7540
Would something like this work for you?
library(tidyverse)
data <- data.frame(country1 = c("BEL", "FRA", "BEL", "FRA", "AUS", "ITA"),
country2 = c("FRA", "BEL", "FRA", "BEL", "ITA", "AUS"),
year = c(2001,2001,2002,2002,2002,2002),
id = c(1,1,1,1,2,2))
additions <- tribble(
~id, ~country1,
1, "LUX",
2, "POR"
)
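# One row per (year, id, country), with the new countries included
unique_combos <- data |>
  distinct(id, year, country1) |>
  rows_append(additions) |>                # appended rows have year = NA
  expand(year, nesting(id, country1)) |>   # cross every year with every (id, country) pair
  filter(!is.na(year))                     # drop the placeholder NA year
# Self-join on year and id to form ordered pairs, then drop self-pairs
unique_combos |>
  rename(country2 = country1) |>
  full_join(unique_combos) |>
  filter(country1 != country2) |>
  arrange(id, year, country1, country2)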
#> Joining, by = c("year", "id")
#> # A tibble: 24 × 4
#> year id country2 country1
#> <dbl> <dbl> <chr> <chr>
#> 1 2001 1 FRA BEL
#> 2 2001 1 LUX BEL
#> 3 2001 1 BEL FRA
#> 4 2001 1 LUX FRA
#> 5 2001 1 BEL LUX
#> 6 2001 1 FRA LUX
#> 7 2002 1 FRA BEL
#> 8 2002 1 LUX BEL
#> 9 2002 1 BEL FRA
#> 10 2002 1 LUX FRA
#> # … with 14 more rows
Created on 2022-06-29 by the reprex package (v2.0.1)
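One thing to be aware of: `expand(year, nesting(id, country1))` crosses every year in the data with every group, so group 2 also gets 2001 rows. That is why the result has 24 rows rather than the 18 shown in the question. If those extra rows are unwanted, a minimal base R sketch along the following lines might help. It assumes the countries to add live in a lookup table with columns `id` and `country` (both names are made up for this example), and that each group should only be expanded for the years in which it already appears.

```r
data <- data.frame(
  country1 = c("BEL", "FRA", "BEL", "FRA", "AUS", "ITA"),
  country2 = c("FRA", "BEL", "FRA", "BEL", "ITA", "AUS"),
  year     = c(2001, 2001, 2002, 2002, 2002, 2002),
  id       = c(1, 1, 1, 1, 2, 2)
)

# Hypothetical lookup table: one row per (group, country to add)
additions <- data.frame(id = c(1, 2), country = c("LUX", "POR"))

# Years in which each group is actually observed
group_years <- unique(data[, c("id", "year")])

# Members of each group: every existing country plus the additions
members <- unique(rbind(
  data.frame(id = data$id, country = data$country1),
  data.frame(id = data$id, country = data$country2),
  additions
))

# All ordered pairs of members within a group; the suffixes turn the two
# `country` columns into `country1` and `country2`
pairs <- merge(members, members, by = "id", suffixes = c("1", "2"))
pairs <- pairs[pairs$country1 != pairs$country2, ]

# Repeat each pair for every year of its own group only
result <- merge(pairs, group_years, by = "id")
result <- result[
  order(result$id, result$year, result$country1, result$country2),
  c("country1", "country2", "year", "id")
]
rownames(result) <- NULL

nrow(result)  # 18
```

With 100 groups, only `additions` needs to grow; the rest stays the same. A group can also receive more than one new country by listing several rows with the same `id`.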
Upvotes: 1