Reputation: 3763
I want to parse a string of ASCII characters enclosed in single quotes, where a literal single quote inside the string is escaped by writing two ' in a row. For example, the input
'string value contained between single quotes -> '' and so on...'
which should result in:
string value contained between single quotes -> ' and so on...
use nom::{
    bytes::complete::{tag, take_while},
    error::{ErrorKind, ParseError},
    sequence::delimited,
    IResult,
};

fn main() {
    let res = string_value::<(&str, ErrorKind)>("'abc''def'");
    assert_eq!(res, Ok(("", "abc\'def")));
}

pub fn is_ascii_char(chr: char) -> bool {
    chr.is_ascii()
}

fn string_value<'a, E: ParseError<&'a str>>(i: &'a str) -> IResult<&'a str, &'a str, E> {
    delimited(tag("'"), take_while(is_ascii_char), tag("'"))(i)
}
How can I detect an escaped quote so that it is not treated as the end of the string?
Upvotes: 3
Views: 1706
Reputation: 28572
I'm learning nom; below is my attempt.
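use nom::{
    bytes::complete::{tag, take},
    sequence::delimited,
    IResult,
};

fn main() {
    let a = r###"'string value contained between single quotes -> '' and so on...'"###;

    // Assumes the quoted string spans the whole input: take everything
    // except the first and last character.
    fn parser(input: &str) -> IResult<&str, &str> {
        let len = input.chars().count() - 2;
        delimited(tag("'"), take(len), tag("'"))(input)
    }

    let (remaining, matched) = parser(a).unwrap_or_default();
    // Collapse each doubled quote into a single quote.
    let matched = matched.replace("''", "'");
    println!("remaining: {:#?}", remaining);
    println!("matched: {:#?}", matched);
}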
It prints this result:
remaining: ""
matched: "string value contained between single quotes -> ' and so on..."
My testing is based on nom 6.2.1.
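Note that this parser takes everything between the first and last character, so it only works when the quoted literal spans the entire input. As a rough sketch of one way around that, the function below scans for the real closing quote and treats '' as an escape. It uses only the standard library rather than nom, and the name split_quoted is hypothetical, chosen for illustration.

```rust
/// Splits a leading single-quoted literal off `input`.
/// Returns `(remaining, raw_body)`, where `raw_body` still contains the
/// doubled quotes, or `None` if `input` does not start with a complete literal.
/// `split_quoted` is a hypothetical helper, not part of nom.
fn split_quoted(input: &str) -> Option<(&str, &str)> {
    // The literal must open with a single quote.
    let body = input.strip_prefix('\'')?;
    let bytes = body.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'\'' {
            if bytes.get(i + 1) == Some(&b'\'') {
                // Doubled quote: an escaped quote, skip both bytes.
                i += 2;
                continue;
            }
            // Lone quote: this closes the literal.
            return Some((&body[i + 1..], &body[..i]));
        }
        i += 1;
    }
    // Ran out of input without finding a closing quote.
    None
}
```

The returned body can then be unescaped with replace("''", "'") as above. Scanning bytes is safe here because the quote byte never occurs inside a multi-byte UTF-8 sequence, so both slices fall on character boundaries.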
Upvotes: 0
Reputation: 13942
This is pretty tricky, but the following works:
//# nom = "5.0.1"
use nom::{
    bytes::complete::{escaped_transform, tag},
    character::complete::none_of,
    combinator::{recognize, map_parser},
    multi::{many0, separated_list},
    sequence::delimited,
    IResult,
};

fn main() {
    let (_, res) = parse_quoted("'abc''def'").unwrap();
    assert_eq!(res, "abc'def");
    let (_, res) = parse_quoted("'xy@$%!z'").unwrap();
    assert_eq!(res, "xy@$%!z");
    let (_, res) = parse_quoted("'single quotes -> '' and so on...'").unwrap();
    assert_eq!(res, "single quotes -> ' and so on...");
}

fn parse_quoted(input: &str) -> IResult<&str, String> {
    let seq = recognize(separated_list(tag("''"), many0(none_of("'"))));
    let unquote = escaped_transform(none_of("'"), '\'', tag("'"));
    let res = delimited(tag("'"), map_parser(seq, unquote), tag("'"))(input)?;
    Ok(res)
}
Some explanations:
seq recognizes any sequence that alternates between doubled single quotes ('') and anything else;
unquote transforms each doubled quote into a single one;
map_parser then combines the two to produce the desired result.
Be aware that, because of the escaped_transform combinator, the parsing result is a String instead of a &str, i.e., there are extra allocations.
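If the extra allocation matters, one option is to allocate only when an escape is actually present. The following is a minimal sketch of that idea using only the standard library, not nom; unescape_quotes is a hypothetical helper name, and it assumes the raw text between the outer quotes has already been extracted.

```rust
use std::borrow::Cow;

/// Collapses every doubled single quote (`''`) in `raw` into one quote.
/// Borrows the input when there is nothing to unescape, so only literals
/// that actually contain an escape pay for an allocation.
/// `unescape_quotes` is a hypothetical helper, not part of nom.
fn unescape_quotes(raw: &str) -> Cow<'_, str> {
    if raw.contains("''") {
        Cow::Owned(raw.replace("''", "'"))
    } else {
        Cow::Borrowed(raw)
    }
}
```

For a body such as abc''def this yields an owned abc'def, while a body without any doubled quote is returned as a borrowed slice of the input.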
Upvotes: 6