binaryworldexplorer
binaryworldexplorer

Reputation: 41

Ocaml recursion: Pattern Matching

I am trying to write to recursive fn that changes a unlabelled tree to a labelled tree with the use of a helper function.

required info:

type nucleotide = 
| G | C
| A | T 

type helix = nucleotide list

type tree = 
| Leaf of helix 
| Node of tree * tree

type labeled_tree =
| LLeaf of helix
| LNode of labeled_tree * helix * labeled_tree

This is the helper function:

let rec guess_parent_helix (x1: helix) (x2: helix) : helix =
 begin match (x1, x2) with
 ([], []) -> []
 | (h1 :: t1, h2 :: t2) ->
   (if h1 = h2 then h1 else A) :: guess_parent_helix t1 t2
 | _ -> failwith "invalid input: x1 and x2 have unequal lengths"
 end

I have written the following function but am unable to compile as it says I have not exhausted the pattern matching : Node( (Node(_ , _ ),_) ). What I do not understand is, how is this not included in the Node(rt,lt) pattern match:

let rec add_ancestor_labels (r: tree) : labeled_tree =
  begin match r with
    | Leaf x -> LLeaf x
    | Node (lt,rt) -> 
        begin match r with
        | Leaf x -> LLeaf x
        | Node (Leaf[m], Leaf[n]) -> 
            LNode (add_ancestor_labels lt, guess_parent_helix [m] [n], add_ancestor_labels rt) 
        end
    end

Upvotes: 3

Views: 940

Answers (3)

binaryworldexplorer
binaryworldexplorer

Reputation: 41

 let rec add_ancestor_labels (r: tree) : labeled_tree =
begin match r with
| Leaf x -> LLeaf x
| Node (lt,rt) -> 
    begin match (lt, rt) with
    | (Leaf m, Leaf n) -> 
        LNode (LLeaf m, guess_parent_helix m n, 
        LLeaf n) 
    | (lt,Leaf n)-> LNode(add_ancestor_labels lt, 
    guess_parent_helix (helix_of_tree (add_ancestor_labels lt)) n, LLeaf n)
    | (Leaf m,rt)->LNode(LLeaf m, guess_parent_helix 
    (helix_of_tree(add_ancestor_labels rt)) m, add_ancestor_labels rt)
    | (_,_)-> failwith"invalid input"
    end
end

Managed to finish it. Thank you for your help.

Upvotes: 1

gsg
gsg

Reputation: 9377

Exhaustiveness checking only applies to the cases of a single match. Information about which cases are possible will not be propagated to nested matches, which can occasionally lead to some "impossible" cases.

Where possible, it is better to extend the match with additional cases like this:

let rec add_ancestor_labels = function
  | Leaf x -> LLeaf x
  | Node (Leaf x, Leaf y) ->
     LNode (LLeaf x,
            guess_parent_helix x y,
            LLeaf y)
  | Node (Leaf x, Node (l, r)) ->
     LNode (LLeaf x,
            ...,
            LNode (add_ancestor_labels l,
                   ...,
                   add_ancestor_labels r))
  | Node (Node ..., Leaf x) -> ...
  | Node (Node ..., Node ...) -> ...

An alternative to repeating the outermost constructor is to bind the arguments to variables and then match on those:

let rec add_ancestor_labels = function
  | Leaf x -> LLeaf x
  | Node (l, r) ->
     begin
       match l, r with
        | ...
     end

Finally, note that Leaf [m] only matches leaves that have helixes of length 1 - leaves with helixes of other lengths are unhandled. It's not clear whether this is what you intended, but exhaustiveness checking will warn if you forget to handle the case of leaves of other lengths.

The warning messages that OCaml gives you for nonexhaustive matches should include some of the missing cases, so do read them carefully.

As for your function, I think it would be written like this:

let add_ancestor_labels tree =
  let rec label = function
    | Leaf x -> x, LLeaf x
    | Node (left, right) ->
       let llabel, ltree = label left in
       let rlabel, rtree = label right in
       let l = guess_parent_helix llabel rlabel in
       l, LNode (ltree, l, rtree) in
  snd (label tree)

(Which is very similar to Jeffrey's suggestion.)

Upvotes: 1

Jeffrey Scofield
Jeffrey Scofield

Reputation: 66793

It's the inner match that's incomplete. The outer match is fine.

Update

The code as it stands doesn't make a lot of sense. After matching r and finding that it's an internal node, you match r again. There's no need to do this, you already know it's an internal node.

It's hard to give specific advice since I don't know what the labels are supposed to look like. Here is some code that sets each label to the concatenation of the nucleotides below it:

let labeled_tree_of_tree tree =
    let rec go t =
        match t with
        | Leaf x -> (x, LLeaf x)
        | Node (lt, rt) ->
            let (llabel, llt) = go lt in
            let (rlabel, lrt) = go rt in
            let label = llabel @ rlabel in
            (label, LNode (llt, label, lrt))
    in
    let (_, ltree) = go tree in
    ltree

Maybe this will give a feel for what your code might look like. No doubt it will be more complicated than this.

Upvotes: 2

Related Questions