Reputation: 3
Say I have two objects,
mixed
# A tibble: 7 x 2
genus epithet
<chr> <chr>
1 Vincetoxicum nigrum
2 Rosa multiflora
3 Quercus rubra
4 Acer saccharum
5 Rosa pendula
6 Vincetoxicum nigrum
7 Vincetoxicum nigrum
and
invasives
# A tibble: 4 x 2
genus epithet
<chr> <chr>
1 Larix pendula
2 Picea abies
3 Rosa multiflora
4 Vincetoxicum nigrum
I want to find the rows of "mixed" that match a row of "invasives" on both columns, and get an index that lets me pull those matching rows from "mixed". Note that "pendula" appears in "epithet" in both "mixed" and "invasives", but it is paired with "Larix" in "invasives" and with "Rosa" in "mixed", so that row should not be included in the final product.
So once that index is created, I'd want to run:
columns_matched <- mixed[index,]
yielding:
columns_matched
# A tibble: 4 x 2
genus epithet
<chr> <chr>
1 Vincetoxicum nigrum
2 Rosa multiflora
3 Vincetoxicum nigrum
4 Vincetoxicum nigrum
CSV versions of the tables ("mixed" first, then "invasives"):
genus,epithet
Vincetoxicum,nigrum
Rosa,multiflora
Quercus,rubra
Acer,saccharum
Rosa,pendula
Vincetoxicum,nigrum
Vincetoxicum,nigrum
genus,epithet
Larix,pendula
Picea,abies
Rosa,multiflora
Vincetoxicum,nigrum
Thanks.
Upvotes: 0
Views: 46
Reputation: 1800
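The easiest answer that comes to mind is to just inner_join your datasets. With no "by" argument, the join uses every column the two tables share (here genus and epithet), so only rows of mixed that match a row of invasives on both columns are left over: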
library(tidyverse)
mixed <- read_csv('genus,epithet
Vincetoxicum,nigrum
Rosa,multiflora
Quercus,rubra
Acer,saccharum
Rosa,pendula
Vincetoxicum,nigrum
Vincetoxicum,nigrum')
#> Rows: 7 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): genus, epithet
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
invasives <- read_csv('genus,epithet
Larix,pendula
Picea,abies
Rosa,multiflora
Vincetoxicum,nigrum')
#> Rows: 4 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): genus, epithet
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
mixed %>%
  inner_join(invasives)
#> Joining, by = c("genus", "epithet")
#> # A tibble: 4 × 2
#> genus epithet
#> <chr> <chr>
#> 1 Vincetoxicum nigrum
#> 2 Rosa multiflora
#> 3 Vincetoxicum nigrum
#> 4 Vincetoxicum nigrum
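One caveat with the join above: inner_join returns one output row per matching pair, so if invasives ever listed the same species twice, the matching rows of mixed would be duplicated. A filtering join avoids that. The following is a minimal sketch, assuming dplyr is installed and rebuilding the two tables as plain data frames so the snippet runs on its own:

```r
library(dplyr)

# Same data as in the question, rebuilt as plain data frames
mixed <- data.frame(
  genus   = c("Vincetoxicum", "Rosa", "Quercus", "Acer",
              "Rosa", "Vincetoxicum", "Vincetoxicum"),
  epithet = c("nigrum", "multiflora", "rubra", "saccharum",
              "pendula", "nigrum", "nigrum")
)
invasives <- data.frame(
  genus   = c("Larix", "Picea", "Rosa", "Vincetoxicum"),
  epithet = c("pendula", "abies", "multiflora", "nigrum")
)

# semi_join() keeps each row of `mixed` that has at least one match in
# `invasives`; it never duplicates rows and adds no columns from `invasives`
columns_matched <- semi_join(mixed, invasives, by = c("genus", "epithet"))
columns_matched
```

Spelling out by = c("genus", "epithet") also silences the "Joining, by" message and makes the matching columns explicit.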
If you really want that index, you could add a helper column of row numbers to mixed before joining, and then pull that column out of the joined result:
index <- mixed %>%
  mutate(index = seq_along(genus)) %>%
  inner_join(invasives) %>%
  pull(index)
#> Joining, by = c("genus", "epithet")
mixed[index,]
#> # A tibble: 4 × 2
#> genus epithet
#> <chr> <chr>
#> 1 Vincetoxicum nigrum
#> 2 Rosa multiflora
#> 3 Vincetoxicum nigrum
#> 4 Vincetoxicum nigrum
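If only the index is needed, it can also be built without a join. This is a rough base R sketch, not the approach used above: it pastes the two columns into a single key per row and tests membership with %in%. The "|" separator is an assumption; it must not occur in either column, otherwise different genus/epithet pairs could collide.

```r
# Same data as in the question, rebuilt as plain data frames
mixed <- data.frame(
  genus   = c("Vincetoxicum", "Rosa", "Quercus", "Acer",
              "Rosa", "Vincetoxicum", "Vincetoxicum"),
  epithet = c("nigrum", "multiflora", "rubra", "saccharum",
              "pendula", "nigrum", "nigrum")
)
invasives <- data.frame(
  genus   = c("Larix", "Picea", "Rosa", "Vincetoxicum"),
  epithet = c("pendula", "abies", "multiflora", "nigrum")
)

# One key per row; "|" is assumed not to appear in the data
key_mixed     <- paste(mixed$genus, mixed$epithet, sep = "|")
key_invasives <- paste(invasives$genus, invasives$epithet, sep = "|")

# Positions of the rows of `mixed` whose key also occurs in `invasives`
index <- which(key_mixed %in% key_invasives)
index
#> [1] 1 2 6 7

columns_matched <- mixed[index, ]
```

Because %in% only asks whether a key is present, duplicate rows in invasives do not produce repeated indices.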
Upvotes: 1