Conrad.Dean
Conrad.Dean

Reputation: 4421

"Stack overflow during evaluation" with stdlib List.map

Say I have a bunch of ones:

Array.make (int_of_float (2. ** 17.)) 1
    |> Array.to_list;;
- : int list
= [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; ...]

And I want to map a function over these ones:

Array.make (int_of_float (2. ** 17.)) 1
    |> Array.to_list
    |> List.map (fun x -> x * 2);;
Stack overflow during evaluation (looping recursion?). 

Seems like it's too much! Looking into how List.map is implemented in Ocaml, I find this (https://github.com/ocaml/ocaml/blob/trunk/stdlib/list.ml#L57-L59):

let rec map f = function
    [] -> []
  | a::l -> let r = f a in r :: map f l

I'm pretty new to Ocaml, but it looks like map is written in a way that makes it not tail-recursive.

Is that right?

How do I map a function over lots of stuff?

Upvotes: 3

Views: 250

Answers (3)

PatJ
PatJ

Reputation: 6144

What you want to do is construct the list at the end of your calculation:

let construct_value _ =
 1 |> (fun x -> x * 2)

let rec construct_list f n =
 let rec aux acc n =
  if n < 0
  then acc
  else aux (f n) (n-1)
 in
 aux [] (n-1)

let result = construct_list construct_value (int_of_float (2. ** 17.))

Here you end up with the same list as in your calculations, but with way less time and memory usage. Note that the elements of the list are constructed from the bottom of the list (you have to do it for tail-recursion), I added an "index" parameter that corresponds to the exact position in your array (that I did not construct, another time and memory gain)

Also, if you want to do calculations on that big of a data structure, lists are probably not the best option, but that depends on what you intend to do with it after.

Upvotes: 2

ivg
ivg

Reputation: 35210

Yes, List.map in OCaml Standard Library is not tail-recursive. You have several options here:

  1. Use Array.map
  2. Use List.rev_map (possibly coined with List.rev)
  3. Use some other data structure that is more suitable to your task
  4. Don't use standard library

While the first two options are obvious, the latter two need some explanation.

Better data structure

If you really have a big amount of data, and if this data is numerical, then you should consider using Bigarrays. Also, depending on you use case, Maps and Hashtables might be better. People in functional programming languages tend to use lists everywhere, instead of choosing a proper data structure. Don't step into this pitfall.

Other libraries and why it is non tail recursive

The List.map is non tail recursive for a good reason. There is always a tradeoff between stack usage and performance. For small lists (and this is the most common use case) a non-tail recursive version is much faster. So the standard library decided to provide a fast List.map and in case if you need to work on big lists then you can use List.rev_map xs |> List.rev. Moreover, sometimes you can omit the List.rev part. So, the standard library, is not trying to think for you, it gives you a choice.

On the other hand, with time people managed to find some optimum in this tradeoff, i.e., having a constant stack version which is quite fast. The solution is to use a non tail recursive approach when a list is small, and then fallback to a slow version. Such implementations are provided by Janestreet Core Library, Batteries and Containers library.

So, you can switch to those libraries and forget about this issue. Though it is still highly advised to use List.map scarcely and with care, as this operation is always has linear memory consumption (either heap or stack memory), that can be avoided. So, it is better to use rev_map whenever it is possible.

Upvotes: 5

Pierre G.
Pierre G.

Reputation: 4431

You are correct List.map is not tail recursive , use List.rev_map instead which is tail recursive. ... And List.rev that should be tail recursive.

List documents all the function of List module, indicating whether it is not tail recursive.

Upvotes: 3

Related Questions