FunkySayu

Reputation: 8061

OCaml: pretty slow file reader

I have the following code, which reads a file and strips every comment line from it.

let s_read_all line =
    if line = "" then
        raise Pbm_format_error
    else if line.[0] = '#' then
        ""  
    else
        line ^ "\n"
;;

let read_all flec =
    let rec loop accum_ref =
        let line = input_line flec in
        accum_ref := (!accum_ref) ^ (s_read_all line);
        loop accum_ref
    in  

    let accum_ref = ref "" in
    try 
        loop accum_ref
    with
        End_of_file -> !accum_ref
;;

My code is really slow on a 180k-line file (about 2 minutes). I run it in the interpreter (the toplevel). Is that what makes my code so slow?

Upvotes: 1

Views: 122

Answers (2)

Jackson Tale

Reputation: 25812

You are using line ^ "\n" and (!accum_ref) ^ (s_read_all line).

As in Java, ^ is a direct concatenation: every call allocates a new string and copies both operands into it. Since the accumulator keeps growing, each new line copies everything read so far, so I guess this is why it is slow for 180k lines.

You should use Buffer, which plays the same role as StringBuilder in Java.

Also, if you pass a good initial size to Buffer.create, it will be slightly faster.

exception Pbm_format_error

let s_read_all line =
    if line = "" then
        raise Pbm_format_error
    else if line.[0] = '#' then
        ""  
    else
        line


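let read_all flec =
    (* Append each kept line to the buffer; input_line raises
       End_of_file once the channel is exhausted. *)
    let rec loop accum_buf =
        let line = s_read_all (input_line flec) in
        Buffer.add_string accum_buf line;
        (* Comment lines come back as "", so they get no newline. *)
        if line <> "" then Buffer.add_string accum_buf "\n" else ();
        loop accum_buf
    in
    (* Initial size guess: 180k lines of up to 128 bytes each. *)
    let accum_buf = Buffer.create (180 * 1000 * 128) in
    try
        loop accum_buf
    with
        End_of_file -> Buffer.contents accum_buf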
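
To see the difference in isolation, here is a minimal sketch that works on an in-memory list of lines instead of a channel. The names keep, join_concat and join_buffer are made up for this example, and unlike the code above it silently skips empty lines instead of raising Pbm_format_error. Both functions return the same string; only the amount of copying differs.

```ocaml
(* Hypothetical helper: keep non-empty lines that are not comments. *)
let keep line = String.length line > 0 && line.[0] <> '#'

(* Repeated ^ : every step copies the whole accumulated string again,
   so the total work grows quadratically with the input size. *)
let join_concat lines =
  List.fold_left
    (fun acc line -> if keep line then acc ^ line ^ "\n" else acc)
    "" lines

(* Buffer: bytes are appended in place and the storage is enlarged
   only occasionally, so the total work stays roughly linear. *)
let join_buffer lines =
  let buf = Buffer.create 1024 in
  List.iter
    (fun line ->
      if keep line then begin
        Buffer.add_string buf line;
        Buffer.add_char buf '\n'
      end)
    lines;
  Buffer.contents buf
```

On a handful of lines you will not notice any difference; the gap should only show up once the accumulated string gets large.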

Upvotes: 1

Rémi

Reputation: 8332

The problem is that string concatenation is slow. More precisely, it is repeated string concatenation that is slow. You should accumulate the lines in a Buffer instead of a string:

let read_all flec =
    let rec loop buffer =
        let line = input_line flec in
        Buffer.add_string buffer (s_read_all line);
        loop buffer
    in  

    let buffer = Buffer.create 180 in
    try 
        loop buffer
    with
        End_of_file -> Buffer.contents buffer
;;
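
For completeness, a self-contained sketch of how this could be called on a real file. The read_file helper is an assumption of mine, not part of the question, and Fun.protect needs OCaml 4.08 or later. The initial buffer size of 4096 is an arbitrary guess.

```ocaml
exception Pbm_format_error

(* Same filter as in the question: drop comment lines, reject empty ones. *)
let s_read_all line =
  if line = "" then raise Pbm_format_error
  else if line.[0] = '#' then ""
  else line ^ "\n"

(* Buffer-based reader: stops when input_line raises End_of_file. *)
let read_all flec =
  let buffer = Buffer.create 4096 in
  let rec loop () =
    Buffer.add_string buffer (s_read_all (input_line flec));
    loop ()
  in
  try loop () with End_of_file -> Buffer.contents buffer

(* Hypothetical helper: open the file, read it, and close the channel
   even if Pbm_format_error is raised halfway through. *)
let read_file path =
  let ic = open_in path in
  Fun.protect ~finally:(fun () -> close_in ic) (fun () -> read_all ic)
```

You would then call read_file with the path of your file and get the comment-free contents back as one string.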

Upvotes: 5
