Reputation: 679
I need to filter (select) strings that follow certain rules, print them, and count the number of filtered strings. The input is a big string, and I need to apply the following rules to each line:
ab, cd, pq, or xy
aa, ff, yy, etc.
I'm using the regex crate, and it provides regex::RegexSet so I can combine multiple rules. The rules I added are as follows:
let regexp = regex::RegexSet::new(&[
    r"^((?!ab|cd|pq|xy).)*", // rule 1
    r"((.)\1{9,}).*", // rule 3
    r"(\b[aeiyou]+\b).*", // rule 2
])
But I don't know how to use these rules to filter the lines and iterate over them.
pub fn p1(lines: &str) -> u32 {
    lines
        .split_whitespace().filter(|line| { /* regex filter goes here */ })
        .map(|line| println!("{}", line))
        .count() as u32
}
Also the compiler says that the crate doesn't support look-around, including look-ahead and look-behind.
Upvotes: 2
Views: 2781
Reputation: 15354
If you're looking to use a single regex, then doing this via the regex crate (which, by design, and as documented, does not support look-around or backreferences) is probably not possible. You could use a RegexSet, but implementing your third rule would require using a regex that lists every repetition of a Unicode letter. This would not be as bad if you were okay limiting this to ASCII, but your comments suggest this isn't acceptable.
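To make the ASCII-only case concrete: a pattern that lists every doubled lowercase ASCII letter needs no backreference, so it stays within what the regex crate supports. Here is a minimal sketch that builds such a pattern with the standard library only. The helper name `doubled_ascii_pattern` is hypothetical, and restricting the alphabet to `a` through `z` is an assumption that, per the comments, does not fit the actual requirement.

```rust
/// Builds the alternation "aa|bb|...|zz" over lowercase ASCII letters.
/// Hypothetical helper for illustration; it is not part of the regex crate.
fn doubled_ascii_pattern() -> String {
    ('a'..='z')
        // Each letter contributes one alternative: the letter written twice.
        .map(|c| format!("{}{}", c, c))
        .collect::<Vec<_>>()
        .join("|")
}
```

The resulting string could be handed to `Regex::new` or used as one entry of a `RegexSet`. Doing the same for every Unicode letter would produce a far longer alternation, which is why this only looks reasonable for ASCII.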
So I think your practical options here are either to use a library that supports fancier regex features (such as fancy-regex for a pure Rust library, or pcre2 if you're okay using a C library), or to write just a bit more code:
use regex::Regex;

fn main() {
    let corpus = "\
baz
ab
cwm
foobar
quux
foo pq bar
";
    // Lines containing any of these substrings are rejected.
    let blacklist = Regex::new(r"ab|cd|pq|xy").unwrap();
    // Lines must contain at least one of these letters.
    let vowels = Regex::new(r"[aeiouy]").unwrap();
    let it = corpus
        .lines()
        .filter(|line| !blacklist.is_match(line))
        .filter(|line| vowels.is_match(line))
        .filter(|line| repeated_letter(line));
    for line in it {
        println!("{}", line);
    }
}

// Returns true if any character is immediately followed by the same character.
// This replaces the backreference that the regex crate does not support.
fn repeated_letter(line: &str) -> bool {
    let mut prev = None;
    for ch in line.chars() {
        if prev.map_or(false, |prev| prev == ch) {
            return true;
        }
        prev = Some(ch);
    }
    false
}
Playground link: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c0928793474af1f9c0180c1ac8fd2d47
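Since the question also asks for a count, the same three filters can be folded into a function shaped like the `p1` from the question. The following is only a sketch under a few assumptions: it swaps the two regexes for plain `str::contains` checks so that it needs no external crate, it splits on lines rather than whitespace, and it treats "contains at least one of `aeiouy`" as the vowel rule, as the code above does.

```rust
/// Returns true if any character is immediately followed by the same character.
fn repeated_letter(line: &str) -> bool {
    let mut prev = None;
    for ch in line.chars() {
        if prev == Some(ch) {
            return true;
        }
        prev = Some(ch);
    }
    false
}

/// Prints every line that passes all three checks and returns how many did.
pub fn p1(lines: &str) -> u32 {
    // Assumption: these four substrings are the full blacklist.
    const BLACKLIST: [&str; 4] = ["ab", "cd", "pq", "xy"];
    lines
        .lines()
        // Reject lines containing any blacklisted substring.
        .filter(|line| !BLACKLIST.iter().any(|bad| line.contains(*bad)))
        // Keep lines with at least one vowel.
        .filter(|line| line.chars().any(|c| "aeiouy".contains(c)))
        // Keep lines with a doubled character.
        .filter(|line| repeated_letter(line))
        // Print as a side effect without consuming the iterator.
        .inspect(|line| println!("{}", line))
        .count() as u32
}
```

Using `inspect` instead of `map` for the printing keeps the side effect separate from the items being counted. If the blacklist or vowel rule grows more complicated, going back to `Regex::is_match` for those two filters is a drop-in change.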
Upvotes: 2