Reputation: 875

When does the relaxed value restriction kick in in OCaml?

Can someone give a concise description of when the relaxed value restriction kicks in? I've had trouble finding a concise and clear description of the rules. There's Garrigue's paper:

http://caml.inria.fr/pub/papers/garrigue-value_restriction-fiwflp04.pdf

but it's a little dense. Anyone know of a pithier source?

An Addendum

Some good explanations were added below, but I was unable to find an explanation there for the following behavior:

# let _x = 3 in (fun () -> ref None);;
- : unit -> 'a option ref = <fun>
# let _x = ref 3 in (fun () -> ref None);;
- : unit -> '_a option ref = <fun>

Can anyone clarify the above? Why does the stray definition of a ref within the RHS of the enclosing let affect the heuristic.

Upvotes: 9

Answers (4)

curiousleo

Reputation: 296

The question why the two examples given in the addendum are typed differently has puzzled me for a couple of days. Here is what I found by digging into the OCaml compiler's code (disclaimer: I'm neither an expert on OCaml nor on the ML type system).

Recap

# let _x = 3 in (fun () -> ref None);;        (*  (1)  *)
- : unit -> 'a option ref = <fun>

is given a polymorphic type (think ∀ α. unit → α option ref) while

# let _x = ref 3 in (fun () -> ref None);;    (*  (2)  *)
- : unit -> '_a option ref = <fun>

is given a monomorphic type (think unit → α option ref, that is, the type variable α is not universally quantified).

Intuition

For the purposes of type checking, the OCaml compiler sees no difference between example (2) and

# let r = ref None in (fun () -> r);;         (*  (3)  *)
- : unit -> '_a option ref = <fun>

since it doesn't look into the body of the let to see if the bound variable is actually used (as one might expect). But (3) clearly must be given a monomorphic type, otherwise a polymorphically typed reference cell could escape, potentially leading to unsound behaviour like memory corruption.

Expansiveness

To understand why (1) and (2) are typed the way they are, let's have a look at how the OCaml compiler actually checks whether a let expression is a value (i.e. "nonexpansive") or not (see is_nonexpansive):

let rec is_nonexpansive exp =
  match exp.exp_desc with
    (* ... *)
  | Texp_let(rec_flag, pat_exp_list, body) ->
      List.for_all (fun vb -> is_nonexpansive vb.vb_expr) pat_exp_list &&
      is_nonexpansive body
  | (* ... *)

So a let-expression is a value if both its body and all the bound variables are values.

In both examples given in the addendum, the body is fun () -> ref None, which is a function and hence a value. The difference between the two pieces of code is that 3 is a value while ref 3 is not. Therefore OCaml considers the first let a value but not the second.

Typing

Again looking at the code of the OCaml compiler, we can see that whether an expression is considered expansive determines how the type of the let-expressions is generalised (see type_expression):

(* Typing of toplevel expressions *)

let type_expression env sexp =
  (* ... *)
  let exp = type_exp env sexp in
  (* ... *)
  if is_nonexpansive exp then generalize exp.exp_type
  else generalize_expansive env exp.exp_type;
  (* ... *)

Since let _x = 3 in (fun () -> ref None) is nonexpansive, it is typed using generalize which gives it a polymorphic type. let _x = ref 3 in (fun () -> ref None), on the other hand, is typed via generalize_expansive, giving it a monomorphic type.

That's as far as I got. If you want to dig even deeper, reading Oleg Kiselyov's Efficient and Insightful Generalization alongside generalize and generalize_expansive may be a good start.

Many thanks to Leo White from OCaml Labs Cambridge for encouraging me to start digging!

Upvotes: 3

gasche

Reputation: 31459

Jeffrey provided the intuitive explanation of why the relaxation is correct. As to when it is useful, I think we can first reproduce the answer octref helpfully linked to:

You may safely ignore those subtleties until, someday, you hit a problem with an abstract type of yours that is not as polymorphic as you would like, and then you should remember than a covariance annotation in the signature may help.

We discussed this on reddit/ocaml a few months ago:

Consider the following code example:
module type S = sig
  type 'a collection
  val empty : unit -> 'a collection
end

module C : S = struct
  type 'a collection =
    | Nil
    | Cons of 'a * 'a collection
  let empty () = Nil
end

let test = C.empty ()
The type you get for test is '_a C.collection, instead of the 'a C.collection that you would expect. It is not a polymorphic type ('_a is a monomorphic inference variable that is not yet fully determined), and you won't be happy with it in most cases.

This is because C.empty () is not a value, so its type is not generalized (~ made polymorphic). To benefit from the relaxed value restriction, you have to mark the abstract type 'a collection covariant:
module type S = sig
  type +'a collection
  val empty : unit -> 'a collection
end
Of course this only happens because the module C is sealed with the signature S : module C : S = .... If the module C was not given an explicit signature, the type-system would infer the most general variance (here covariance) and one wouldn't notice that.

Programming against an abstract interface is often useful (when defining a functor, or enforcing a phantom type discipline, or writing modular programs) so this sort of situation definitely happens and it is then useful to know about the relaxed value restriction.

That's an example of when you need to be aware of it to get more polymorphism, because you set up an abstraction boundary (a module signature with an abstract type) and it doesn't work automatically, you have explicitly to say that the abstract type is covariant.

In most cases it happens without your notice, when you manipulate polymorphic data structures. [] @ [] only has the polymorphic type 'a list thanks to the relaxation.

A concrete but more advanced example is Oleg's Ber-MetaOCaml, which uses a type ('cl, 'ty) code to represent quoted expressions which are built piecewise. 'ty represents the type of the result of the quoted code, and 'cl is a kind of phantom region variable that guarantees that, when it remains polymorphic, the scoping of variable in quoted code is correct. As this relies on polymorphism in situations where quoted expressions are built by composing other quoted expressions (so are generally not values), it basically would not work at all without the relaxed value restriction (it's a side remark in his excellent yet technical document on type inference).

Upvotes: 5

Jeffrey Scofield

Reputation: 66818

I am not a type theorist, but here is my interpretation of Garrigue's explanation. You have a value V. Start with the type that would be assigned to V (in OCaml) under the usual value restriction. There will be some number (maybe 0) monomorphic type variables in the type. For each such variable that appears only in covariant position in the type (on the right sides of function arrows), you can replace it with a fully polymorphic type variable.

The argument goes as follows. Since your monomorphic variable is a variable, you can imagine replacing it with any single type. So you choose an uninhabited type U. Now since it is in covariant position only, U can in turn be replaced by any supertype. But every type is a supertype of an uninhabited type, hence it's safe to replace with a fully polymorphic variable.

So, the relaxed value restriction kicks in when you have (what would be) monomorphic variables that appear only in covariant positions.

(I hope I have this right. Certainly @gasche would do better, as octref suggests.)

Upvotes: 9

octref

Reputation: 6801

Although I'm not very familiar with this theory, I have asked a question about it.
gasche provided me with a concise explanation. The example is just a part of OCaml's map module. Check it out!
Maybe he will be able to provide you with a better answer. @gasche

Upvotes: 1