Reputation: 1099
I am new to OCaml, but I have experience in F# and Haskell. I'm really surprised by the apparent lack of functionality that seems elementary for a standard library. To illustrate, I just want to read the contents of a file so that the text can then be parsed (several times). There doesn't seem to be any function that returns the contents of a file (there is In_Channel.read_all, but this is part of the Jane Street library, which is not cross-platform, and therefore I don't want to use it).
So I implemented my function with what the standard lib offers, but on the one hand I really don't find it very idiomatic, and on the other hand it's very slow. So I wonder how I could make it more efficient or, better, whether there is a more efficient way to do what I want.
Here is the function:
let read_file filename =
  let res = ref "" in
  let read_contents = open_in filename in
  try
    while true do
      res := input_line read_contents ^ !res ^ "\n"
    done;
    !res
  with End_of_file ->
    close_in read_contents;
    !res
Moreover, if the file starts with new lines, the resulting string will not have taken them into account, which is a bit annoying, but not too serious in my case.
Upvotes: 0
Views: 91
Reputation: 66803
It's true, the OCaml standard library is quite sparse.
If you don't mind using Unix primitives (many of which also work on Windows) you can read a file with just one read call like this:
let read_whole_file filename =
  let open Unix in
  let fd = openfile filename [O_RDONLY] 0o666 in
  let len = lseek fd 0 SEEK_END in
  ignore (lseek fd 0 SEEK_SET);
  let res = Bytes.make len '\000' in
  if read fd res 0 len <> len then
    failwith "partial read";
  close fd;
  res
Note that this returns the result as bytes (a mutable array of characters, in essence). You can convert to string if necessary. In recent OCaml versions strings are immutable (which is how they should be IMHO).
Update
I don't know how I missed these yesterday, but there are functions in the standard library that will do this. Here's a revised version in case it's useful:
let read_whole_file filename =
  let chan = open_in_bin filename in
  let res =
    really_input_string chan (in_channel_length chan)
  in
  close_in chan;
  res
Note that this uses open_in_bin to avoid modification of line endings under Windows. This is necessary (I believe) to get agreement with the length returned by in_channel_length.
(It's still true that the OCaml standard library is pretty sparse.)
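For readers on a recent compiler, here is a minimal sketch of the same idea using the In_channel module, assuming OCaml 4.14 or later (where that module is part of the standard library, independent of Jane Street's). The function name read_file_contents is hypothetical and only used for illustration.

```ocaml
(* Sketch, assuming OCaml >= 4.14 (stdlib In_channel module).
   [read_file_contents] is a hypothetical name, not a stdlib function. *)
let read_file_contents filename =
  (* with_open_bin opens the file in binary mode and closes the channel
     when the callback returns or raises; input_all reads everything
     that remains on the channel into a string. *)
  In_channel.with_open_bin filename In_channel.input_all
```

Binary mode is kept here for the same reason as above: the bytes come back unmodified, including any Windows line endings.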
Upvotes: 2
Reputation: 18892
You can use containers and its read_all function as an extension to the standard library.
Concerning your function, it is accidentally quadratic because
  res := input_line read_contents ^ !res ^ "\n"
allocates a new string for every line read. It is better to use Buffer (or String.concat) when building a string by repeatedly appending small strings.
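As an illustration of the Buffer suggestion, here is a minimal sketch that keeps the line-by-line approach but accumulates into a Buffer. It assumes OCaml 4.08 or later for Fun.protect; the initial buffer size of 4096 is an arbitrary choice.

```ocaml
(* Sketch: read a file line by line, accumulating into a Buffer so that
   each append is amortised constant time instead of copying the whole
   accumulated string. Assumes OCaml >= 4.08 for Fun.protect. *)
let read_file filename =
  let ic = open_in filename in
  let buf = Buffer.create 4096 in
  let rec loop () =
    match input_line ic with
    | line ->
        Buffer.add_string buf line;
        (* input_line strips the newline, so put one back. This also
           means the result ends with '\n' even when the file does not. *)
        Buffer.add_char buf '\n';
        loop ()
    | exception End_of_file -> ()
  in
  (* Close the channel whether the loop finishes or raises. *)
  Fun.protect ~finally:(fun () -> close_in ic) loop;
  Buffer.contents buf
```

Note that this version appends lines in file order, whereas the expression quoted above places each newly read line in front of the accumulated text, which pushes all the newlines to the end of the result.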
Upvotes: 4