mschmidt

Reputation: 2790

Odd results of regular expression matching of OCaml `Str` module

When I execute the following test program:

let re = Str.regexp "{\\(foo\\)\\(bar\\)?}"

let check s =
    try
        let n = Str.search_forward re s 0 in
        let a = Str.matched_group 1 s in
        let b = Str.matched_group 2 s in
        Printf.printf "'%s' => n=%d, a='%s', b='%s'\n" s n a b
    with
        _ -> Printf.printf "'%s' => not found\n" s

let _ =
    check "{foo}";
    check "{foobar}"

I get strange results:

$ ocaml str.cma test.ml
'{foo}' => not found
'{foobar}' => n=0, a='foo', b='bar'

Is grouping via \\( and \\) incompatible with the ? operator? The documentation does not mention this.

Upvotes: 0

Views: 204

Answers (1)

Jeffrey Scofield

Reputation: 66803

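In your first example, group 2 does not take part in the match, so the call to Str.matched_group 2 s raises Not_found. Your catch-all handler (_ ->) then prints "not found", even though Str.search_forward itself succeeded.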

To get finer-grained results than your check function gives, you need to handle each group separately, with its own try block. In principle, any single call to Str.matched_group can raise Not_found, depending on the regular expression and on the string being matched.

I rewrote your check function like this:

let check s =
    let check1 n g =
        try
            let m = Str.matched_group g s in
            Printf.printf "'%s' group %d => n = %d, match = '%s'\n"
                s g n m
        with Not_found ->
            Printf.printf "'%s' group %d => not matched\n" s g
    in
    let n = Str.search_forward re s 0 in
    check1 n 1;
    check1 n 2

Here is the output for the revised code:

'{foo}' group 1 => n = 0, match = 'foo'
'{foo}' group 2 => not matched
'{foobar}' group 1 => n = 0, match = 'foo'
'{foobar}' group 2 => n = 0, match = 'bar'
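
If you would rather not write a try block for every group, one option is to wrap Str.matched_group in a small helper that returns an option. This is only a minimal sketch: the names group_opt and find are made up for illustration, and it assumes OCaml 4.02 or later for the "match ... with exception" syntax.

```ocaml
(* Needs the Str library, e.g.: ocaml str.cma example.ml *)
let re = Str.regexp "{\\(foo\\)\\(bar\\)?}"

(* Hypothetical helper: Some text if group g took part in the
   most recent match against s, None if it did not. *)
let group_opt g s =
  try Some (Str.matched_group g s) with Not_found -> None

(* Hypothetical helper: None if the regexp matches nowhere in s,
   otherwise Some (position, group 1, group 2). *)
let find s =
  match Str.search_forward re s 0 with
  | exception Not_found -> None
  | n ->
    (* Read the groups right away, before any other Str call
       overwrites the library's global match state. *)
    let a = group_opt 1 s in
    let b = group_opt 2 s in
    Some (n, a, b)
```

With this, find "{foo}" gives Some (0, Some "foo", None), so a failed search and an unmatched optional group are no longer confused with each other.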

Upvotes: 2
