partial string matching in R? Is this possible?

Question

I'm not actually sure if this is possible. I have these two data frames that have scientific names. Some of them are misspelled, some have missing spaces, others are homonyms (not the same species), and others match. So I have something like this:

stringDF <- data.frame(string = c("Abietinella abietina (Hedw.) M.Fleisch.", "Abietinella abietina (Hedw.) M. Fleisch.", "Abietinella abietina (Hedw.) Smith", "Abitinella abietina (Hedw.) M. Fleisch."))
patternDF <- data.frame(string = "Abietinella abietina (Hedw.) M. Fleisch.", match = "A")

patternDF has the "correct name" plus a column (that I'm calling "match") containing important information. I'm trying to make a "match" column in stringDF where "A" is pasted when it matches partially. So ideally, I'd like something like this:

string                                      match
Abietinella abietina (Hedw.) M.Fleisch.     A
Abietinella abietina (Hedw.) M. Fleisch.    A
Abietinella abietina (Hedw.) Smith          NA
Abitinella abietina (Hedw.) M. Fleisch.     A

I've tried using this function:

stringDF$match <- patternDF$match[pmatch(stringDF$string, patternDF$string)]

but I'm not having any luck. Is this possible to do in R? I've also tried using the %like% operator from the data.table package.

I'm not the best at coding, so sorry in advance for my ignorance! Thanks y'all!


Answers (1)
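
One possible approach, sketched under stated assumptions rather than offered as a definitive solution: pmatch() only succeeds when the input is an initial substring of a target, so a missing space or a dropped letter defeats it. Approximate matching by edit distance tolerates those small differences. The sketch below uses base R's adist() (from utils) to compute the Levenshtein distance between every name in stringDF and every name in patternDF, then keeps the closest pattern only if it is within a tolerance. The names max_dist, best, and best_dist, and the tolerance of 3 edits, are assumptions made for illustration and would need tuning on real data.

```r
# Data from the question
stringDF <- data.frame(
  string = c("Abietinella abietina (Hedw.) M.Fleisch.",
             "Abietinella abietina (Hedw.) M. Fleisch.",
             "Abietinella abietina (Hedw.) Smith",
             "Abitinella abietina (Hedw.) M. Fleisch."),
  stringsAsFactors = FALSE
)
patternDF <- data.frame(
  string = "Abietinella abietina (Hedw.) M. Fleisch.",
  match = "A",
  stringsAsFactors = FALSE
)

# Edit-distance matrix: one row per stringDF name, one column per patternDF name
d <- utils::adist(stringDF$string, patternDF$string)

# Assumed tolerance: accept a pattern at most 3 edits away (hypothetical value)
max_dist <- 3

# For each row, find the closest pattern and its distance
best <- apply(d, 1, which.min)
best_dist <- d[cbind(seq_len(nrow(d)), best)]

# Copy the pattern's "match" value when close enough, otherwise NA
stringDF$match <- ifelse(best_dist <= max_dist,
                         patternDF$match[best],
                         NA_character_)

print(stringDF)
```

With the example data, the missing-space and misspelled names are each one edit away from the correct name and receive "A", while the "Smith" homonym is much further away and stays NA. A fixed tolerance is a judgment call: too large and homonyms start matching, too small and real typos are missed, so scaling it by name length may work better on longer lists. Packages such as stringdist (for example its amatch() function with a maxDist argument) offer other distance metrics if base R is not enough.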
