Matthias M
Matthias M

Reputation: 33

Matching two dataframes by columns while preserving multiple matches

I have two datasets. One has the name of some ethnic groups (The "Branoo", "Sinatu" and Gulipo" and their political claims (Independence or Irredentism) by year:

 Groups <- data.frame(group_names = c("Barnoo", "Sinatu", "Gulipo"),
                 year = c("2000", "2000", "2000", "2001", "2001", "2001", "2002", "2002", "2002"),
                 claim = c("Independence", "Irredentism", "Irredentism", "Independence", "Irredentism", "Independence", "Independence", "NA", "Irredentism"))

enter image description here

A second dataset has yearly information on which states support each group's claim. One state can support multiple groups, and one group can be supported by multiple states:

Countries <- data.frame(group_names = c("Egypt", "Jordan", "Greece"),
                      year = c("2000", "2000", "2000", "2001", "2001", "2001", "2002", "2002", "2002"),
                      support_independence = c("Barnoo", "Sinatu", "NA", "Barnoo", "Sinatu", "NA", "NA", "Sinatu", "NA"),
                      support_independence2 = c("Sinatu", "NA", "NA", "Sinatu", "NA", "NA", "Sinatu", "NA", "NA"),
                      support_irredentism = c("NA", "NA", "NA", "NA", "NA", "NA", "NA", "NA", "Barnoo")

enter image description here

My goal is to have a table that lists each ethnic group, its claim, and its supporting states like so:

Group Year Supporter of Independence 1 Supporter of Independence 2 Supporter of Irredentism 1 Supporter of Irredentism 2
Branoo 2000 Egypt NA NA NA
Branoo 2001 Egypt NA NA NA
Sinatu 2000 Egypt Jordan NA NA
Sinatu 2001 Egypt Jordan NA NA

I feel like this should be relatively simple, but I'm having trouble dealing with the fact that I can't just match groups to countries 1:1 because some groups have multiple supporting countries.

Any help would be greatly appreciated!

Upvotes: 1

Views: 32

Answers (1)

Ma&#235;l
Ma&#235;l

Reputation: 52349

A crude way with tidyverse:

library(tidyverse)
left_join(Groups[1:2], Countries[1:3], by = c("year", "group_names" = "support_independence")) %>% 
  left_join(Countries[c(1, 2, 4)], by = c("year", "group_names" = "support_independence2")) %>% 
  left_join(Countries[c(1, 2, 5)], by = c("year", "group_names" = "support_irredentism")) %>% 
  set_names(colnames(Countries))
  group_names year support_independence support_independence2 support_irredentism
1      Barnoo 2000                Egypt                  <NA>                <NA>
2      Sinatu 2000               Jordan                 Egypt                <NA>
3      Gulipo 2000                 <NA>                  <NA>                <NA>
4      Barnoo 2001                Egypt                  <NA>                <NA>
5      Sinatu 2001               Jordan                 Egypt                <NA>
6      Gulipo 2001                 <NA>                  <NA>                <NA>
7      Barnoo 2002                 <NA>                  <NA>              Greece
8      Sinatu 2002               Jordan                 Egypt                <NA>
9      Gulipo 2002                 <NA>                  <NA>                <NA>

Upvotes: 1

Related Questions