Reputation: 1280
I have a dataset that looks like this :
id1 | var1 |
---|---|
A | chair |
B | table |
C | glass |
D | phone |
E | pistol |
and a second one that contains the licenses of each id (but it could contain different ids)
id2 | var2 |
---|---|
G | mobile |
H | pistol |
I | pistol |
E | phone |
D | phone |
I want to check if the ids in the first dataframe are licensed to have what they declare. For example the id D is licensed to have a phone but E is not licensed to have pistol because it's licensed to have phone. So there are three conditions here and ideally the final data frame must look like this:
id1 | var1 | license |
---|---|---|
A | chair | not_needed |
B | table | not_needed |
C | glass | not_needed |
D | phone | ok_checked |
E | pistol | danger |
How can I make this cross check and print these messages according to the logical condition in R using dplyr
?
library(tidyverse)
id1 =c("A","B","C","D","E")
var1 = c("chair","table","glass","phone","pistol")
data1 = tibble(id1,var1);data1
id2 = c("G","H","I","E","D")
var2 = c("mobile","pistol","pistol","phone","phone")
data2 = tibble(id2,var2);data2
Upvotes: 0
Views: 832
Reputation: 76
library(dplyr)
data1 |>
full_join(data2, by=c("id1"="id2")) |>
rename(declared=var1, actual=var2) |>
mutate(license=ifelse(is.na(declared),"Not declared",
ifelse(declared %in% c("chair","table","glass"),"Not needed",
ifelse(declared==actual,"OK","Danger"))))
I have use full_join. You can use let join if you don't need all the ids.
I have also assumed that chair, table and gas do not need license. You can add or remove items for your need.
Finally, you can remove the columns that you don't need.
It is also possible to use case_when()
function instead of ifelse()
statements to achieve the same result.
id1 declared actual license
<chr> <chr> <chr> <chr>
1 A chair NA Not needed
2 B table NA Not needed
3 C glass NA Not needed
4 D phone phone OK
5 E pistol phone Danger
6 G NA mobile Not declared
7 H NA pistol Not declared
8 I NA pistol Not declared
Upvotes: 1
Reputation: 19097
You can first left_join
the two dataset, then use a case_when
statement to assign terms to the condition.
library(tidyverse)
left_join(data1, data2, by = c("id1" = "id2")) %>%
mutate(var2 = case_when(is.na(var2) ~ "not_needed",
var1 == var2 ~ "ok_checked",
var1 != var2 ~ "danger",
TRUE ~ NA_character_)) %>%
rename("license" = "var2")
# A tibble: 5 × 3
id1 var1 license
<chr> <chr> <chr>
1 A chair not_needed
2 B table not_needed
3 C glass not_needed
4 D phone ok_checked
5 E pistol danger
Upvotes: 1