mdcq

Reputation: 2026

How to partition a string into two groups using a regex?

I'd like to partition a string into two groups by providing the regex for only one group in Rust.

The regex for the opposite group is not known. I only know the regex for the separator.

For example, with the regex \d+ and the following string

123abcdef456ghj789

I'd like to obtain both of these strings

abcdefghj

and

123456789

Using the regex and itertools crates, I'm able to get the first group like this:

use itertools::Itertools;
use regex::Regex;

let text = "123abcdef456ghj789";
let re = Regex::new(r"\d+").unwrap();
let text1 = re.split(text).join(""); // abcdefghj

How can I get the second group?

Upvotes: 0

Views: 149

Answers (3)

Kaplan

Reputation: 3758

A bit more verbose, but without an additional library such as itertools (only the regex crate is needed):

let re = regex::Regex::new(r"\d+").unwrap();
let mut text1 = String::new();
let mut text2 = String::new();
let mut beg = 0;
let txt = "123abcdef456ghj789";
for r in re.find_iter(txt).map(|m| m.range()) {
    text1 += &txt[r.clone()];
    text2 += &txt[beg..r.start];
    beg = r.end;
}
text2 += &txt[beg..];
println!("{text1}\n{text2}");

Upvotes: 0

kmdreko

Reputation: 60493

You can get the desired result very similarly:

re.find_iter(text).map(|m| m.as_str()).join("");

.find_iter() returns all matches as an iterator, and calling .as_str() on each match gives the full matched text. Then use .join() from itertools, as you did for the first group.

Full example on the playground.


It would be nice though if there was a single method that returned a tuple of the disjoined partitions.

It would be nice and certainly possible since the matches return all the information needed to slice-and-dice the text in one pass. Here's my attempt that iteratively calls .find_at():

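use regex::Regex;

/// Returns (matched text, unmatched text), each concatenated in order.
fn partition_regex(re: &Regex, text: &str) -> (String, String) {
    let mut a = String::new(); // everything the regex matched
    let mut b = String::new(); // everything between the matches

    // Note: this loop assumes `re` never matches the empty string;
    // an empty match would leave `search_idx` unchanged and loop forever.
    let mut search_idx = 0;
    while let Some(m) = re.find_at(text, search_idx) {
        a.push_str(m.as_str());
        b.push_str(&text[search_idx..m.start()]);
        search_idx = m.end();
    }
    // Whatever follows the last match is unmatched text.
    b.push_str(&text[search_idx..]);

    (a, b)
}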

Full example on the playground.
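
As a variation on the one-pass idea, the slicing can be separated from the matching. The sketch below is only an illustration: partition_by_ranges is a hypothetical helper (not part of the regex crate) that assumes the byte ranges arrive in ascending, non-overlapping order and fall on character boundaries, which is what re.find_iter(text).map(|m| m.range()) should provide. Because find_iter advances past empty matches by itself, this shape also avoids the case where a manual find_at loop would stop making progress on a pattern that can match the empty string.

```rust
use std::ops::Range;

/// Hypothetical helper: splits `text` into (matched, unmatched) parts.
/// Assumes `ranges` are byte ranges in ascending, non-overlapping order,
/// e.g. produced by `re.find_iter(text).map(|m| m.range())`.
fn partition_by_ranges<I>(text: &str, ranges: I) -> (String, String)
where
    I: IntoIterator<Item = Range<usize>>,
{
    let mut matched = String::new();
    let mut unmatched = String::new();
    let mut last_end = 0;

    for r in ranges {
        // Text between the previous match and this one is unmatched.
        unmatched.push_str(&text[last_end..r.start]);
        matched.push_str(&text[r.start..r.end]);
        last_end = r.end;
    }
    // Whatever follows the last match is unmatched as well.
    unmatched.push_str(&text[last_end..]);

    (matched, unmatched)
}
```

With the regex crate, a call would look like partition_by_ranges(text, re.find_iter(text).map(|m| m.range())), and the helper itself can be exercised without any regex by passing ranges directly.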

Upvotes: 3

pigeonhands

Reputation: 3424

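You can use Iterator::partition to split items into two collections based on a predicate. Note that this example classifies whole lines of content (and concatenates them), rather than substrings within a line: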

let re = Regex::new(r"(^[a-z]+)").unwrap();

let (matches, non_matches): (String, String) 
    = content.lines().partition(|x| re.is_match(x));
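
For the specific pattern in the question, where the separator is just a run of digits, the same partition idea can be applied per character instead of per line, with no regex at all. This is a minimal sketch under that assumption; partition_digits is a hypothetical name, and it only recognises ASCII digits, whereas \d in the regex crate also matches other Unicode decimal digits by default.

```rust
/// Hypothetical helper: splits `text` into (digits, everything else)
/// in a single pass over its characters.
/// Only ASCII digits 0-9 count as digits here.
fn partition_digits(text: &str) -> (String, String) {
    // `partition` collects items for which the predicate is true into the
    // first collection and all other items into the second.
    text.chars().partition(|c| c.is_ascii_digit())
}
```

This works because String implements both Default and Extend<char>, which is what partition requires of its output collections. It does not generalise to patterns that span several characters; for those, a regex-based approach is still needed.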

Upvotes: 0
