Reputation: 327
I have strings with descriptions of gender that I need to sort out. For instance, if I have the following,
string1 = "FEMALE AND FEMALE"
string2 = "FEMALE AND MALE"
I need to change string1 to say "MULTIPLE FEMALES", and string2 to say "BOTH MALE AND FEMALE".
Using gsub, I am having trouble writing a substitution that recognizes string2 as different from string1, because MALE is nested in FEMALE. Using "YEP" as a confirmation string first, I have tried the following with no luck:
gsub(".*FEMALE.*MALE.*", "YEP", string1)
gsub(".*FEMALE.*[^M]ALE.*", "YEP", string1)
gsub(".*FEMALE.*[^\b]MALE.*", "YEP", string1)
gsub(".*FEMALE.*(^\bMALE).*", "YEP", string1)
gsub(".*FEMALE.*MALE.*", "YEP", string2)
gsub(".*FEMALE.*[^M]ALE.*", "YEP", string2)
gsub(".*FEMALE.*[^\b]MALE.*", "YEP", string2)
gsub(".*FEMALE.*(^\bMALE).*", "YEP", string2)
I need the wildcards between the words because not all strings look like "FEMALE AND FEMALE" or "FEMALE AND MALE"; sometimes they show up as "1 FEMALE 12 MALES" or "B FEMALE WITH 2X W FEMALE", etc.
Any ideas on how to deal with nested strings using regex?
Upvotes: 1
Views: 471
Reputation: 327
Ok, I figured this out right after I posted.
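Running gsub(".*FEMALE.*\\b(M)ALE.*", "YEP", string1) results in "FEMALE AND FEMALE", whereas gsub(".*FEMALE.*\\b(M)ALE.*", "YEP", string2) results in "YEP". So this works. The double backslash matters: in an R string "\b" is a backspace character, while "\\b" reaches the regex engine as a word boundary, so MALE only matches at the start of a word and not inside FEMALE. The parentheses around M are not needed.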
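For the actual relabelling, here is a minimal sketch that builds on the same word-boundary idea. The function name classify_gender is hypothetical (not from the post), and it assumes that two or more whole-word FEMALE matches with no whole-word MALE should become "MULTIPLE FEMALES", and that anything else is left unchanged. Unlike the gsub pattern above, it does not depend on FEMALE appearing before MALE. It passes perl = TRUE as a precaution; the default engine may well work too.

```r
# Hypothetical helper (an illustration, not part of the original post):
# relabel a description based on which genders appear as whole words.
classify_gender <- function(x) {
  # "\\b" is a regex word boundary, so "MALE" inside "FEMALE" is not counted.
  # Count whole-word FEMALE / FEMALES occurrences in each string.
  female_hits <- regmatches(x, gregexpr("\\bFEMALES?\\b", x, perl = TRUE))
  n_female <- lengths(female_hits)

  # TRUE where MALE / MALES appears as its own word.
  has_male <- grepl("\\bMALES?\\b", x, perl = TRUE)

  # Assumed labelling rules; strings matching neither rule are returned as-is.
  ifelse(has_male & n_female > 0, "BOTH MALE AND FEMALE",
         ifelse(!has_male & n_female > 1, "MULTIPLE FEMALES", x))
}

descriptions <- c("FEMALE AND FEMALE",
                  "FEMALE AND MALE",
                  "1 FEMALE 12 MALES",
                  "B FEMALE WITH 2X W FEMALE")
labels <- classify_gender(descriptions)
print(labels)
```

With the four example strings from the question, this should give "MULTIPLE FEMALES", "BOTH MALE AND FEMALE", "BOTH MALE AND FEMALE" and "MULTIPLE FEMALES".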
Upvotes: 1