Reputation: 177
Some time ago, I decided to solve a simple task on HackerRank but using OCaml and Core, in order to learn them. In one of the tasks, I'm supposed to read data from standard input:
The first line contains an integer, denoting the number of entries in the phone book. Each of the subsequent lines describes an entry in the form of space-separated values on a single line. The first value is a friend's name, and the second value is an -digit phone number.
After the lines of phone book entries, there are an unknown number of lines of queries. Each line (query) contains a to look up, and you must continue reading lines until there is no more input.
The main issues:
scanf "%s\n"
until End_of_file
And my code became messy:
open Core.Std
open Printf
open Scanf
let read_numbers n =
let phone_book = String.Table.create () ~size:n in
for i = 0 to (n - 1) do
match In_channel.input_line stdin with
| Some line -> (
match (String.split line ~on:' ') with
| key :: data :: _ -> Hashtbl.set phone_book ~key ~data
| _ -> failwith "This shouldn't happen"
)
| None -> failwith "This shouldn't happen"
done;
phone_book
let () =
let rec loop phone_book =
match In_channel.input_line stdin with
| Some line -> (
let s = match Hashtbl.find phone_book line with
| Some number -> sprintf "%s=%s" line number
| None -> "Not found"
in
printf "%s\n%!" s;
loop phone_book
)
| None -> ()
in
match In_channel.input_line stdin with
| Some n -> (
let phone_book = read_numbers (int_of_string n) in
loop phone_book
)
| None -> failwith "This shouldn't happen"
If I solve this task in Python, then code looks like this:
n = int(input())
book = dict([tuple(input().split(' ')) for _ in range(n)])
while True:
try:
name = input()
except EOFError:
break
else:
if name in book:
print('{}={}'.format(name, book[name]))
else:
print('Not found')
This is shorter and clearer than the OCaml code. Any advice on how to improve my OCaml code? And there two important things: I don't want to abandon OCaml, I just want to learn it; second - I want to use Core because of the same reason.
Upvotes: 6
Views: 8081
Reputation: 8720
As in Python, the key to a concise implementation is to let the standard library do most of the work; the following code uses Sequence.fold
in lieu of Python's list comprehension. Also, using Pervasives.input_line
rather than In_channel.input_line
allows you to cut down on extraneous pattern matching (it will report an end of file condition as an exception rather than a None
result).
open Core.Std
module Dict = Map.Make(String)
let n = int_of_string (input_line stdin)
let d = Sequence.fold
(Sequence.range 0 n)
~init:Dict.empty
~f:(fun d _ -> let line = input_line stdin in
Scanf.sscanf line "%s %s" (fun k v -> Dict.add d ~key:k ~data:v))
let () =
try while true do
let name = input_line stdin in
match Dict.find d name with
| Some number -> Printf.printf "%s=%s\n" name number
| None -> Printf.printf "Not found.\n"
done with End_of_file -> ()
Upvotes: 6
Reputation: 35210
The direct implementation of the Python code in OCaml would look like this:
let exec name =
In_channel.(with_file name ~f:input_lines) |> function
| [] -> invalid_arg "Got empty file"
| x :: xs ->
let es,qs = List.split_n xs (Int.of_string x) in
let es = List.map es ~f:(fun entry -> match String.split ~on:' ' entry with
| [name; phone] -> name,phone
| _ -> invalid_arg "bad entry format") in
List.iter qs ~f:(fun name ->
match List.Assoc.find es name with
| None -> printf "Not found\n"
| Some phone -> printf "%s=%s\n" name phone)
However, OCaml is not a script-language for writing small scripts and one shot prototypes. It is the language for writing real software, that must be readable, supportable, testable, and maintainable. That's why we have types, modules, and all the stuff. So, if I were writing a production quality program, that is responsible for working with such input, then it will look very differently.
The general style that I personally employ, when I'm writing a program in a functional language is to follow these two simple rules:
I.e., allocate a type for each concept in the program domain, and use lots of small function.
The following code is twice as big, but is more readable, maintainable, and robust.
So, first of all, let's type: the entry is simply a record. I used a string type to represent a phone for simplicity.
type entry = {
name : string;
phone : string;
}
The query is not specified in the task, so let's just stub it with a string:
type query = Q of string
Now our parser state. We have three possible states: the Start
state, a state Entry n
, where we're parsing entries with n
entries left so far, and Query
state, when we're parsing queries.
type state =
| Start
| Entry of int
| Query
Now we need to write a function for each state, but first of all, let's define an error handling policy. For a simple program, I would suggest just to fail on a parser error. We will call a function named expect
when our expectations fail:
let expect what got =
failwithf "Parser error: expected %s got %s\n" what got ()
Now the three parsing functions:
let parse_query s = Q s
let parse_entry s line = match String.split ~on:' ' line with
| [name;phone] -> {name;phone}
| _ -> expect "<name> <phone>" line
let parse_expected s =
try int_of_string s with exn ->
expect "<number-of-entries>" s
Now let's write the parser:
let parse (es,qs,state) input = match state with
| Start -> es,qs,Entry (parse_expected input)
| Entry 0 -> es,qs,Query
| Entry n -> parse_entry input :: es,qs,Entry (n-1)
| Query -> es, parse_query input :: qs,Query
And finally, let's read data from file:
let of_file name =
let es,qs,state =
In_channel.with_file name ~f:(fun ch ->
In_channel.fold_lines ch ~init:([],[],Start) ~f:parse) in
match state with
| Entry 0 | Query -> ()
| Start -> expect "<number-of-entries><br>..." "<empty>"
| Entry n -> expect (sprintf "%d entries" n) "fewer"
We also check that our state machine reached a proper finish state, that is it is either in Query
or Entry 0
state.
Upvotes: 9