Reputation: 179
I'm a beginner with OCaml and I want to read lines from a file and then examine all characters in each line. As a dummy example, let's say we want to count the occurrences of the character 'A' in a file.
I tried the following:
open Core.Std

let count_a acc string =
  let rec count_help res stream =
    match Stream.peek stream with
    | None -> res
    | Some char ->
      Stream.junk stream;
      if char = 'A' then count_help (res+1) stream else count_help res stream
  in
  acc + count_help 0 (Stream.of_string string)

let count_a = In_channel.fold_lines stdin ~init:0 ~f:count_a

let () = print_string ((string_of_int count_a)^"\n")
I compile it with
ocamlfind ocamlc -linkpkg -thread -package core -o solution solution.ml
and run it with
$./solution < huge_file.txt
on a file with one million lines, which gives me the following times:
real 0m16.337s
user 0m16.302s
sys 0m0.027s
which is 4 times slower than my Python implementation. I'm fairly sure it should be possible to make this go faster, but how should I go about doing this?
Upvotes: 1
Views: 341
Reputation: 35210
To count the number of 'A' chars in a string you can just use the String.count function. Indeed, the simplest solution would be:
open Core.Std

let () =
  In_channel.input_all stdin |>
  String.count ~f:(fun c -> c = 'A') |>
  printf "we have %d A's\n"
A slightly more complicated (and less memory-hungry) solution, using In_channel.fold_lines, looks like this:
let () =
  In_channel.fold_lines stdin ~init:0 ~f:(fun n s ->
    n + String.count ~f:(fun c -> c = 'A') s) |>
  printf "we have %d A's\n"
It is slower than the previous one, though: on my 8-year-old laptop it takes 7.3 seconds to count the 'A's in a 20-megabyte text file, versus 3 seconds for the first solution.
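For comparison, here is a minimal sketch of the same line-by-line idea using only the standard library instead of Core. The names count_char and count_in_channel are made up for this illustration, and I have not benchmarked it against the versions above, so treat it as a starting point rather than a known-faster solution. It assumes OCaml 4.02 or later for the `exception` case in `match`.

```ocaml
(* Count how many times [target] occurs in [s].
   String.iter visits each character in place, so no Stream is built. *)
let count_char target s =
  let n = ref 0 in
  String.iter (fun c -> if c = target then incr n) s;
  !n

(* Sum the per-line counts over every line readable from [ic].
   The [exception End_of_file] case keeps the recursive call in tail
   position, so long inputs do not grow the stack. *)
let count_in_channel target ic =
  let rec loop acc =
    match input_line ic with
    | line -> loop (acc + count_char target line)
    | exception End_of_file -> acc
  in
  loop 0

(* Small self-check on an in-memory string. *)
let () = assert (count_char 'A' "BANANA" = 3)

(* To use it on standard input, as in the question (hypothetical main):
   let () = Printf.printf "we have %d A's\n" (count_in_channel 'A' stdin) *)
```

Note also that the command in the question uses ocamlc, which produces bytecode; compiling the same file with ocamlopt produces native code, which may well change the timings noticeably.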
You may also find this post interesting.
Upvotes: 3