Reputation: 2546
I'm trying to match people whose titles meet a certain job code, but there are many abbreviations (e.g., "dr." and "dir" both mean director). For some reason, my code yields obviously wrong answers (e.g., it retains 'kvp coordinator' in the example below), and I can't figure out what's going on:
library(dplyr)
library(stringr)
test <- tibble(name = c("Corey", "Sibley", "Justin", "Kate", "Ruth", "Phil", "Sara"),
title = c("kvp coordinator", "manager", "director", "snr dr. of marketing", "drawing expert", "dir of finance", "direct to mail expert"))
test %>%
filter(str_detect(title, "chief|vp|president|director|dr\\.|dir\\ |dir\\."))
In the above example, only Justin, Kate, and Phil should be left, but somehow the filter doesn't drop Corey.
In addition to an answer, if you could explain why I'm getting this bizarre result, I'd really appreciate it.
Upvotes: 1
Views: 247
Reputation: 11548
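The vp alternative in the str_detect pattern matches the "vp" inside "kvp". A regex alternative matches anywhere in the string unless it is anchored, which is why Corey stays in the output. Wrapping it in word boundaries (\\b) restricts it to the standalone word vp: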
test %>% filter(str_detect(title, "chief|\\bvp\\b|president|director|dr\\.|dir\\ |dir\\."))
# A tibble: 3 x 2
  name   title
  <chr>  <chr>
1 Justin director
2 Kate   snr dr. of marketing
3 Phil   dir of finance
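If it helps to see exactly which part of each title the pattern grabbed, str_extract can be used as a quick diagnostic. The following is only a minimal sketch that reuses the titles from the question; the variable names (titles, old_pattern, new_pattern, old_hits, new_hits) are made up for illustration.

```r
library(stringr)

# Titles copied from the question.
titles <- c("kvp coordinator", "manager", "director", "snr dr. of marketing",
            "drawing expert", "dir of finance", "direct to mail expert")

# Original pattern: str_extract returns the first matched piece of each
# title, or NA when nothing matches, so the offending alternative is visible.
old_pattern <- "chief|vp|president|director|dr\\.|dir\\ |dir\\."
old_hits <- str_extract(titles, old_pattern)
print(old_hits)  # the first element is "vp", taken from inside "kvp"

# Same pattern with word boundaries around vp, so "kvp" no longer qualifies.
new_pattern <- "chief|\\bvp\\b|president|director|dr\\.|dir\\ |dir\\."
new_hits <- str_detect(titles, new_pattern)
print(titles[new_hits])
```

The other alternatives could in principle match inside longer words in the same way, so adding \\b around those too may be worth considering, depending on what the real data looks like.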
Upvotes: 1