Reputation: 9110
basically in my project, I am trying to write a list of strings into file like this:
val mutable rodata_list : (string*string) list = []
.....
let zip1 ll =
List.map (fun (h,e) -> h^e) ll in
let oc = open_out_gen [Open_append; Open_creat] 0o666 "final_data.s" in
List.iter (fun l -> Printf.fprintf oc "%s\n" l) (zip1 rodata_list);
Here is my problem, usually the rodata_list
can reach as long as 800,000 size, and the above code on our server (64-bit, 32 core Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
) takes about 3.5 seconds.. The OCaml
version I use is 4.01.0
.
This is not acceptable, especially as I have 4 piece of code like this to write into a file. Totally they could take me over 15 seconds..
I tried this:
Printf.fprintf oc "%s\n" (String.concat "\n" (zip1 rodata_list));
But no obvious improvement..
So I am wondering that, how to optimize this part? I appreciate any solutions. Thank you!
Upvotes: 0
Views: 224
Reputation: 35210
^
to concatenate a bunch of strings in performance critical code, as it will lead to quadratic complexity;*printf
functions, when performance matters (although in OCaml 4.02 it is pretty fast);So, given the advices above we have the following:
List.iter (fun (x,y) ->
output_string oc x;
output_string oc y;
output_char oc '\n') rodata_list
Also, any optimizations should start from profiling, to get the profile you need to compile it with profiling info, for example like this:
ocamlbuild myprogram.p.native
Then you can run program to collect the profile, that can be read with gprof
. My guess, that you will spend all the time not in the actual IO, or even concatenation, but in garbage collection, since your zip, will create millions of string.
So to proof, that you're actually trying to optimize wrong part of your code, I've wrote this small program:
let rec init_rev acc = function
| 0 -> acc
| n -> init_rev (("hello", "world") :: acc) (n-1)
let () = List.iter (fun (x,y) ->
print_string x;
print_endline y) (init_rev [] 1000_000)
It creates a list of one million elements and outputs it:
$ ocamlbuild main.native
$ time ./main.native > data.txt
real 0m0.998s
user 0m0.211s
sys 0m0.783s
This is on macbook laptop. Moreover we spend most of the time in the system, with only 200ms in OCaml. And a simple loop for 1000_000 iterations without creating a list, takes only 11ms.
So, profile.
Upvotes: 6