Reputation: 3466
I am trying to tokenize boolean algebra, so I need to check if the start of a str
matches a list of tokens (e.g. AND
, OR
, etc.). str.startsWith
in an if else would obviously work, but it is an ugly solution, so I am wondering if theres a way to match an str like
match st {
"OR", .. => ...
}
(obviously not real syntax)
Also: I can not rely on whitespaces or other characters following the String, ABxxxx
should not match, while ORxxxxx
should match.
Upvotes: 2
Views: 1165
Reputation: 15354
I think if you only need to match a few strings, then an if ... else if
tree or Martin's solution is probably best. Otherwise, I don't think there is a way to match string slice prefixes using match
.
If, however, you have many strings you need to match, I'd recommend AhoCorasick
, which will be algorithmically optimal. You can use the aho-corasick
crate.
use aho_corasick::{AhoCorasick, AhoCorasickBuilder, MatchKind};
const PATTERNS: &[&str] = &["OR", "AND"];
fn op<T: AsRef<[u8]>>(ac: &AhoCorasick, v: T) -> Option<&'static str> {
ac.find(v).map(|m| PATTERNS[m.pattern()])
}
fn main() {
let ac = AhoCorasickBuilder::new()
.auto_configure(PATTERNS)
.anchored(true)
.match_kind(MatchKind::LeftmostFirst)
.build(PATTERNS);
println!(
"{:?} / {:?} / {:?}",
op(&ac, "OR"),
op(&ac, "AND"),
op(&ac, "XOR")
)
}
The trick to make it work for prefixes here is to set anchored(true)
. This will ensure that any match returned must be anchored to the beginning of the haystack.
The LeftmostFirst
match kind setting may not be necessary, but it instructs the matcher to prefer earlier patterns over later patterns. e.g., if your patterns are Samwise
and Sam
, then LeftmostFirst
will ensure that Samwise
matches Samwise
and not Sam
. If none of your patterns have overlapping prefixes, then you can remove this setting.
Finally, the auto_configure
option will enable some optimizations in the automaton, such as using a DFA instead of an NFA if the number of patterns is small enough.
See it also on the playground.
Upvotes: 3