Reputation: 304
Hi, I've been searching the internet for a good way to check whether a string ends with certain text in OCaml, and I found that manipulating strings in OCaml is not as trivial as I expected compared to other programming languages like Java.
Here is my OCaml code, which uses Str.regexp to check whether a file name ends with ".ml", i.e. whether it is an OCaml source file. It does not work as I expected, though:
let r = Str.regexp "*\\.ml" in
if (Str.string_match r file 0)
then
let _ = print_endline ("Read file: "^full_path) in
readFile full_path
else
print_endline (full_path^" is not an OCaml file")
Note that readFile is a function I wrote myself to read the file at the constructed full_path. I always get output such as
./utilities/dict.ml is not an OCaml file
./utilities/dict.mli is not an OCaml file
./utilities/error.ml is not an OCaml file
./utilities/error.mli is not an OCaml file
What is wrong with my regexp, and is there a better/simpler way to check how a string ends?
Upvotes: 5
Views: 3459
Reputation: 446
You are probably confusing two styles of patterns: the glob style (used in bash and other shells), where * matches the empty string or any sequence of characters, and the conventional regular-expression style.
You need to check the documentation of the Str library carefully.
http://caml.inria.fr/pub/docs/manual-ocaml/libref/Str.html
This says
. : Matches any character except newline
* : Matches the preceding expression zero, one or several times
As you can see, the Str library adopts the latter style.
So you need to define the regexp like this:
let r = Str.regexp ".*\\.ml";;
val r : Str.regexp = <abstr>
Str.string_match r "fuga.ml" 0;;
- : bool = true
Str.string_match r "fugaml" 0;;
- : bool = false
Str.string_match r "piyo/null/fuga.ml" 0;;
- : bool = true
If you want to use glob-style patterns, you can use the re library.
In my opinion, you don't need a regexp to solve your problem.
Just check whether the input ends with the substring ".ml" using the appropriate string functions.
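A regexp-free check could look like the sketch below. The helper name ends_with is made up for this illustration; it uses only String.length and String.sub from the standard library. On OCaml 4.13 and later, the standard library's String.ends_with ~suffix should do the same job.

```ocaml
(* Hypothetical helper (not part of the original answer):
   true when [s] ends with [suffix]. *)
let ends_with ~suffix s =
  let ls = String.length s and lx = String.length suffix in
  (* The length check keeps String.sub from raising on short inputs. *)
  ls >= lx && String.sub s (ls - lx) lx = suffix

let () =
  if ends_with ~suffix:".ml" "./utilities/dict.ml" then
    print_endline "looks like an OCaml implementation file"
```

Note that this only compares the tail of the string, so "dict.mli" is rejected for the suffix ".ml".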
Upvotes: 3
Reputation: 35210
First of all, your regexp is incorrect: you forgot the . before the *. The correct version is:
let r = Str.regexp {|.*\.ml|}
Note the use of the newer quoted string literal syntax, which lets you write the regexp without doubling the backslashes. With a regular double-quoted string, it should look like this:
let r = Str.regexp ".*\\.ml"
This regular expression is not ideal, as it will also match file.mlx, file.ml.something.else, etc. So a better version, which matches all possible OCaml source file names, is
let r = Str.regexp {|.*\.ml[ily]?$|}
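To see the difference between the two patterns, here is a small sketch that runs both against a few names. It assumes the program is linked against the str library (for example with ocamlfind and -package str); the names unanchored, anchored and matches are made up for this illustration.

```ocaml
(* Needs the str library, e.g.:
   ocamlfind ocamlopt -package str -linkpkg demo.ml *)
let unanchored = Str.regexp {|.*\.ml|}
let anchored = Str.regexp {|.*\.ml[ily]?$|}

(* Str.string_match only requires a match starting at the given position;
   it does not require the whole string to be consumed. *)
let matches r s = Str.string_match r s 0

let () =
  List.iter
    (fun name ->
      Printf.printf "%-24s unanchored=%b anchored=%b\n" name
        (matches unanchored name) (matches anchored name))
    ["dict.ml"; "dict.mli"; "file.mlx"; "file.ml.something.else"]
```

The unanchored pattern should accept all four names, while the anchored one should accept only dict.ml and dict.mli.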
Instead of using a regexp, you can also use the Filename module from the standard library, which has a check_suffix function:
let is_ml file = Filename.check_suffix file ".ml"
To check all possible extensions:
let srcs = [".ml"; ".mli"; ".mly"; ".mll"]
let is_ocaml file = List.exists (Filename.check_suffix file) srcs
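Put together with the reporting from the question, this might look like the sketch below. The describe function and the notes.txt path are hypothetical, added only to show both branches; the actual file reading is left out.

```ocaml
(* Extensions treated as OCaml sources, as listed above. *)
let srcs = [".ml"; ".mli"; ".mly"; ".mll"]

let is_ocaml file = List.exists (Filename.check_suffix file) srcs

(* Hypothetical helper: builds the message the question's code prints. *)
let describe path =
  if is_ocaml path then "Read file: " ^ path
  else path ^ " is not an OCaml file"

let () =
  List.iter
    (fun p -> print_endline (describe p))
    ["./utilities/dict.ml"; "./utilities/dict.mli"; "./utilities/notes.txt"]
```

Since check_suffix compares only the end of the name, no anchoring or escaping is needed.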
Upvotes: 8