Reputation: 9110
I am using this to split strings:
let split = Str.split (Str.regexp_string " ") in
let tokens = split instr in
....
But the problem is that for example here is a sentence I want to parse:
pop    esi
and after the split it turns out to be (I use a helper function to print each item in the tokens list):
item: popitem: item: item: item: esi
See, there are three empty strings in the token list.
I am wondering if there is something like Python's string.split that can parse instr this way:
item: popitem: esi
Is it possible?
Upvotes: 13
Views: 20360
Reputation: 29106
Since OCaml 4.04.0 there is also String.split_on_char, which you can combine with List.filter to remove empty strings:
# "pop esi"
|> String.split_on_char ' '
|> List.filter (fun s -> s <> "");;
- : string list = ["pop"; "esi"]
No external libraries required.
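If the input may also contain tabs or newlines, the same idea extends to any whitespace, which is closer to what Python's split() does. Here is a minimal sketch using only the standard library; is_space and split_whitespace are hypothetical helper names chosen for illustration, not standard functions:

```ocaml
(* Hypothetical helper: which characters count as whitespace here. *)
let is_space = function
  | ' ' | '\t' | '\n' | '\r' -> true
  | _ -> false

(* Hypothetical helper: normalise every whitespace character to a space,
   split on spaces, then drop the empty strings produced by runs of
   consecutive delimiters or by leading/trailing whitespace. *)
let split_whitespace (s : string) : string list =
  s
  |> String.map (fun c -> if is_space c then ' ' else c)
  |> String.split_on_char ' '
  |> List.filter (fun tok -> tok <> "")
```

Mapping to a single delimiter first keeps the code short at the cost of one extra pass over the string, which should not matter for short inputs such as single assembly lines.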
Upvotes: 9
Reputation: 34353
This is how I split my lines into words:
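open Core
(* Drop the empty strings produced by consecutive spaces. List.dedup would
   remove duplicate words instead and still leave one empty string. *)
let tokenize line =
  String.split line ~on:' ' |> List.filter ~f:(fun s -> not (String.is_empty s))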
Mind the single quotes around the space character.
See the Core documentation for String.split for details.
Upvotes: 1
Reputation: 1593
Using Jane Street's Core library, you can do:
open Core
(* Split on any whitespace character, then drop empty tokens. *)
let python_split x =
  String.split_on_chars ~on:[ ' ' ; '\t' ; '\n' ; '\r' ] x
  |> List.filter ~f:(fun s -> not (String.is_empty s))
;;
Upvotes: 7
Reputation: 66803
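Don't use Str.regexp_string; it only matches fixed strings, so every single space is treated as its own delimiter.
Use Str.split (Str.regexp " +") instead. The pattern " +" matches one or more consecutive spaces, so a run of spaces counts as a single delimiter.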
Upvotes: 23