Reputation: 2012
I am trying to write a program that reads diff files and returns just the filenames. So I wrote the following code:
open Printf
open Str
let syname: string = "diff --git a/drivers/usc/filex.c b/drivers/usc/filex"
let fileb =
  let pat_filename = Str.regexp "a\/(.+)b" in
  let s = Str.full_split pat_filename syname in
  s
let print_split_res (elem: Str.split_result) =
  match elem with
  | Text t -> print_string t
  | Delim d -> print_string d
let rec print_list (l: Str.split_result list) =
  match l with
  | [] -> ()
  | hd :: tl -> print_split_res hd ; print_string "\n" ; print_list tl
;;
() = print_list fileb
Upon running this, I get the original string diff --git a/drivers/usc/filex.c b/drivers/usc/filex back as the output.
Whereas if I use the same regex pattern with the Python standard library, I get the desired result:
import re
p=re.compile('a\/(.+)b')
p.findall("diff --git a/drivers/usc/filex.c b/drivers/usc/filex")
Output: ['drivers/usc/filex.c ']
What am I doing wrong?
Upvotes: 0
Views: 73
Reputation: 66818
Not to be snide, but the way to understand OCaml regular expressions is to read the documentation, not compare to things in another language :-) Sadly, there is no real standard for regular expressions across languages.
The main problem appears to be that parentheses in OCaml regular expressions match themselves. To get grouping behavior they need to be escaped with '\\'. In other words, your pattern is looking for actual parentheses in the filename. Your code works for me if you change your regular expression to this:
Str.regexp "a/\\(.+\\)b"
Note that the backslashes must themselves be escaped so that Str.regexp sees them.
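To see the capture group directly, rather than the pieces returned by Str.full_split, here is a minimal sketch using Str.search_forward and Str.matched_group. The helper name first_group and the file name demo.ml are made up for illustration; the build command is one option and assumes ocamlfind is installed.

```ocaml
(* Build with the str library, for example:
   ocamlfind ocamlopt -package str -linkpkg demo.ml   (demo.ml is a hypothetical name) *)
let syname = "diff --git a/drivers/usc/filex.c b/drivers/usc/filex"

(* \\( and \\) delimit a group; unescaped parentheses match literal parentheses. *)
let pat_filename = Str.regexp "a/\\(.+\\)b"

(* first_group is a hypothetical helper: it returns the text captured by
   group 1, or None when the pattern does not occur in s. *)
let first_group (s : string) : string option =
  match Str.search_forward pat_filename s 0 with
  | _ -> Some (Str.matched_group 1 s)
  | exception Not_found -> None

let () =
  match first_group syname with
  | Some g -> Printf.printf "[%s]\n" g
  | None -> print_endline "no match"
```

With the sample line this should print [drivers/usc/filex.c ], including the trailing space, which is the same capture the Python version returns.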
You also have the problem that your pattern doesn't match the slash after b, so the text that follows the match will start with a slash.
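One way around this is to make the pattern consume the " b/" separator as well and capture both paths. The following is only a sketch: the helper name diff_filenames is invented here, and the pattern assumes the paths themselves do not contain the text " b/", which real diff headers do not guarantee.

```ocaml
(* Needs the str library, for example: ocamlfind ocamlopt -package str -linkpkg ... *)
let syname = "diff --git a/drivers/usc/filex.c b/drivers/usc/filex"

(* Group 1 is the path after "a/", group 2 the path after " b/".
   The slash after b is part of the pattern, so neither capture starts with it. *)
let pat_both = Str.regexp "a/\\(.+\\) b/\\(.+\\)$"

(* diff_filenames is a hypothetical helper returning both captured paths. *)
let diff_filenames (line : string) : (string * string) option =
  match Str.search_forward pat_both line 0 with
  | _ -> Some (Str.matched_group 1 line, Str.matched_group 2 line)
  | exception Not_found -> None

let () =
  match diff_filenames syname with
  | Some (a, b) -> Printf.printf "a: %s\nb: %s\n" a b
  | None -> print_endline "not a diff header"
```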
As a side comment, I also removed the backslash before /, which is technically not allowed in an OCaml string.
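If the doubled backslashes are hard to read, OCaml's quoted string literals ({|...|}, available since OCaml 4.02) leave backslashes untouched, so the pattern can be written the way Str.regexp will see it. A small sketch comparing the two spellings; the names escaped and quoted are just for illustration:

```ocaml
(* Ordinary literal: each backslash meant for Str has to be doubled. *)
let escaped = "a/\\(.+\\)b"

(* Quoted literal: no escape processing, so single backslashes reach Str as written. *)
let quoted = {|a/\(.+\)b|}

let () =
  (* Both literals denote the same nine-character pattern. *)
  assert (escaped = quoted);
  print_endline quoted
```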
Upvotes: 2