Reputation: 81
How do I split a string into a list/array of whitespace-separated words?
let x = "this is my sentence";;
And store them in a list/array like this:
["this"; "is"; "my"; "sentence"]
Upvotes: 2
Views: 1452
Reputation: 36496
Posting from a future that has sequences (the Seq module), to offer an alternative that doesn't have to build an entire list unless you actually need one.
We can lazily iterate over a string character by character and use an aux function to decide when to yield a word. An argument to that function accumulates the current word and is reset once the word has been yielded.
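module CharSet = Set.Make (Char)
(* seps is the set of separator characters; the result is a lazy sequence of words. *)
let split_words seps s =
  (* cur accumulates the word currently being read. *)
  let rec aux seq cur () =
    match seq () with
    | Seq.Nil when cur = "" -> Seq.Nil
    | Seq.Nil -> Seq.Cons (cur, Seq.empty)
    | Seq.Cons (ch, next) ->
      let is_sep = CharSet.mem ch seps in
      if is_sep && cur = "" then
        (* Skip separators that precede a word. *)
        aux next "" ()
      else if is_sep then
        (* A separator ends the current word: yield it and reset. *)
        Seq.Cons (cur, aux next "")
      else
        aux next (Printf.sprintf "%s%c" cur ch) ()
  in
  aux (String.to_seq s) ""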
# let x = "this is my sentence" in
  x
  |> split_words @@ CharSet.of_list [' '; '\t'; '\n']
  |> List.of_seq;;
- : string list = ["this"; "is"; "my"; "sentence"]
# let x = "this is my sentence" in
  x
  |> split_words @@ CharSet.of_list [' '; '\t'; '\n']
  |> Array.of_seq;;
- : string array = [|"this"; "is"; "my"; "sentence"|]
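To illustrate why laziness can matter, here is a minimal sketch of a related approach. It is not the code above: it scans by index and cuts words out with String.sub, and the names is_space, words and take are made up for illustration. It assumes OCaml 4.07 or later for Seq and List.of_seq. Only as much of the string is scanned as the consumer asks for.

```ocaml
(* Hypothetical separator test: space, tab and newline only. *)
let is_space = function ' ' | '\t' | '\n' -> true | _ -> false

(* Lazily yield the whitespace-separated words of [s]. *)
let words (s : string) : string Seq.t =
  let len = String.length s in
  (* [go i] yields the words found at or after index [i]. *)
  let rec go i () =
    (* Advance past any separators. *)
    let rec skip i = if i < len && is_space s.[i] then skip (i + 1) else i in
    let start = skip i in
    if start >= len then Seq.Nil
    else begin
      (* Advance to the end of the word that begins at [start]. *)
      let rec stop j =
        if j < len && not (is_space s.[j]) then stop (j + 1) else j
      in
      let fin = stop start in
      Seq.Cons (String.sub s start (fin - start), go fin)
    end
  in
  go 0

(* Keep at most [n] elements; a hand-written stand-in for Seq.take,
   which is only available in newer OCaml releases. *)
let rec take n (seq : 'a Seq.t) : 'a Seq.t = fun () ->
  if n <= 0 then Seq.Nil
  else
    match seq () with
    | Seq.Nil -> Seq.Nil
    | Seq.Cons (x, rest) -> Seq.Cons (x, take (n - 1) rest)

(* Only the first two words are ever extracted from the string. *)
let first_two = "this is my sentence" |> words |> take 2 |> List.of_seq
```

Here first_two evaluates to ["this"; "is"], and the rest of the string is never examined.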
Upvotes: 0
Reputation:
The full process goes like this:
First, run opam install re.
If you are using utop, you can do something like this:
#require "re.pcre"
let () =
  Re_pcre.split ~rex:(Re_pcre.regexp " +") "Hello world more"
  |> List.iter print_endline
Then just run it with utop code.ml.
If you want to compile native code, then you'd have:
let () =
  Re_pcre.split ~rex:(Re_pcre.regexp " +") "Hello world more"
  |> List.iter print_endline
Notice how the #require is gone.
Then at the command line you'd do: ocamlfind ocamlopt -package re.pcre code.ml -linkpkg -o Test
The OCaml website has tons of tutorials and help. I also have a blog post designed to get you up to speed quickly: http://hyegar.com/2015/10/20/so-youre-learning-ocaml/
Upvotes: 1
Reputation: 2706
Use split_delim and regexp from the Str library, which ships with OCaml (in utop, load it first with #require "str"):
Str.split_delim (Str.regexp " ") "this is my sentence";;
- : string list = ["this"; "is"; "my"; "sentence"]
I highly recommend getting utop; it's really good for quickly searching through libraries (I typed Str, saw it was there, then typed Str. and looked for the appropriate function).
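If loading Str is not desirable, roughly the same result can be sketched with String.split_on_char from the standard library, assuming OCaml 4.04 or later. Splitting on a single space produces empty strings wherever two spaces are adjacent, so this sketch filters them out. The name split_on_spaces is made up for illustration.

```ocaml
(* Hypothetical helper: split on single spaces, then drop the empty
   pieces produced by leading, trailing or repeated spaces. *)
let split_on_spaces (s : string) : string list =
  s
  |> String.split_on_char ' '
  |> List.filter (fun w -> w <> "")

let () =
  split_on_spaces "this is  my sentence"
  |> List.iter print_endline
```

This only handles the space character; tabs and newlines would need a second pass or a regular expression such as the Str approach above.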
Upvotes: 2