CaffeinateOften

Reputation: 571

Convenient way to get the first char index of a given string that caused a specific text pattern not to match in Rust?

Language:

Rust

Rust regex crate: https://docs.rs/regex/1.5.4/regex/

Use case:

Printing a friendly diagnostic message to a user whose input text does not match an expected regex pattern, e.g.

  1. if the patterns are Regex::new(r"^--(\w+)=(\w+)$").unwrap() and Regex::new(r"^-(\w+)$").unwrap()

  2. and the user inputs "---abc"

  3. the user can see a diagnostic like:

    "---abc"
       ^ Problem with character "-" at index 2.
    
       Expecting format "--key=value".
                           ^ Does not match expected format at index 2.
    

Possible solution:

Can I do something with capture groups? (They might only be relevant if there is a match.) If there is no solution with capture groups, what else could work?

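use lazy_static::lazy_static;
use regex::Regex;

// "-a[bc..]" or "--key=value"
lazy_static! {
    static ref SHORT_OPTION_RE: Regex = Regex::new(r"^-(\w+)$").unwrap();
    static ref LONG_OPTION_RE: Regex = Regex::new(r"^--(\w+)=(\w+)$").unwrap();
}

// long option example (the wrapper function and its name are illustrative)
fn parse_long_option<E>(s: &str, e_msg: E) -> Result<(String, String), E> {
    // `captures` returns None on any mismatch, with no position information.
    let caps = LONG_OPTION_RE.captures(s).ok_or(e_msg)?;
    // Groups 1 and 2 are not optional, so they are always present on a match.
    let key = caps.get(1).unwrap().as_str().to_string();
    let value = caps.get(2).unwrap().as_str().to_string();
    Ok((key, value))
}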

Issue: Can't get exact char index that caused capture group not to match.

Alternatives:

Out of scope:

I do not need recommendations for CLI libraries or frameworks (unless you're pointing to an implementation detail within one).

Edit: Modified question to be more generic than just regex.

Upvotes: 1

Views: 775

Answers (1)

yolenoyer

Reputation: 9445

I would use a parser like nom.

Here is a quick and partial implementation of your use case:

use nom::{
    bytes::complete::tag, character::complete::alphanumeric1, combinator::map, sequence::tuple,
    IResult,
};

#[derive(Debug)]
struct OptPair {
    key: String,
    value: String,
}

fn parse_option(input: &str) -> IResult<&str, OptPair> {
    map(
        tuple((tag("--"), alphanumeric1, tag("="), alphanumeric1)),
        |(_, k, _, v): (&str, &str, &str, &str)| OptPair {
            key: k.to_owned(),
            value: v.to_owned(),
        },
    )(input)
}

fn test_parse(input: &str) {
    println!("TEST: input = \"{}\":", input);
    match parse_option(input) {
        Ok((_, opt_pair)) => println!("  Ok, {:?}", opt_pair),
        Err(err) => match err {
            nom::Err::Incomplete(_) => eprintln!("  Incomplete"),
            nom::Err::Error(err) => {
                // `err.input` is the unparsed remainder, a subslice of `input`,
                // so the pointer difference is the byte offset of the failure.
                let offset = err.input.as_ptr() as usize - input.as_ptr() as usize;
                eprintln!("  Error at index {}", offset);
            }
            nom::Err::Failure(_err) => eprintln!("  Failure"),
        },
    }
}

fn main() {
    test_parse("--foo=bar");
    test_parse("---foo=bar");
    test_parse("--foo=");
    test_parse("Hello");
}

Output:

TEST: input = "--foo=bar":
  Ok, OptPair { key: "foo", value: "bar" }
TEST: input = "---foo=bar":
  Error at index 2
TEST: input = "--foo=":
  Error at index 6
TEST: input = "Hello":
  Error at index 0
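
To get from that index to the caret-style diagnostic shown in the question, something like the following could work. It is a minimal sketch using only the standard library: `render_diagnostic` is a hypothetical helper (not part of nom or regex), and the message wording is borrowed from the question. It assumes the offset is a byte offset, as computed above, and converts it to a character index so the caret also lines up for non-ASCII input. It further assumes every character is one column wide, which does not hold for wide or combining characters.

```rust
/// Hypothetical helper: builds a two-line diagnostic consisting of the quoted
/// input and a caret under the character that starts at `byte_offset`.
fn render_diagnostic(input: &str, byte_offset: usize) -> String {
    // Clamp so an offset at end-of-input (e.g. for "--foo=") stays valid.
    let byte_offset = byte_offset.min(input.len());

    // Count the characters that start before the byte offset; this is the
    // character index of the failure.
    let char_index = input
        .char_indices()
        .take_while(|(i, _)| *i < byte_offset)
        .count();

    // Describe the offending character, or the end of input if there is none.
    let problem = match input.chars().nth(char_index) {
        Some(c) => format!("Problem with character \"{}\" at index {}.", c, char_index),
        None => format!("Unexpected end of input at index {}.", char_index),
    };

    // One extra space accounts for the opening quote printed around the input.
    let padding = " ".repeat(char_index + 1);
    format!("\"{}\"\n{}^ {}", input, padding, problem)
}

fn main() {
    // Offsets taken from the nom output above.
    println!("{}", render_diagnostic("---abc", 2));
    println!("{}", render_diagnostic("--foo=", 6));
}
```

In `test_parse`, the `nom::Err::Error` arm could pass `input` and `offset` to this helper instead of printing the raw index.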

Upvotes: 1
