Reputation: 4421
Say I have a bunch of ones:
Array.make (int_of_float (2. ** 17.)) 1
|> Array.to_list;;
- : int list
= [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; ...]
And I want to map a function over these ones:
Array.make (int_of_float (2. ** 17.)) 1
|> Array.to_list
|> List.map (fun x -> x * 2);;
Stack overflow during evaluation (looping recursion?).
Seems like it's too much! Looking into how List.map is implemented in Ocaml, I find this (https://github.com/ocaml/ocaml/blob/trunk/stdlib/list.ml#L57-L59):
let rec map f = function
[] -> []
| a::l -> let r = f a in r :: map f l
I'm pretty new to Ocaml, but it looks like map is written in a way that makes it not tail-recursive.
Is that right?
How do I map a function over lots of stuff?
Upvotes: 3
Views: 250
Reputation: 6144
What you want to do is construct the list at the end of your calculation:
let construct_value _ =
1 |> (fun x -> x * 2)
let rec construct_list f n =
let rec aux acc n =
if n < 0
then acc
else aux (f n) (n-1)
in
aux [] (n-1)
let result = construct_list construct_value (int_of_float (2. ** 17.))
Here you end up with the same list as in your calculations, but with way less time and memory usage. Note that the elements of the list are constructed from the bottom of the list (you have to do it for tail-recursion), I added an "index" parameter that corresponds to the exact position in your array (that I did not construct, another time and memory gain)
Also, if you want to do calculations on that big of a data structure, lists are probably not the best option, but that depends on what you intend to do with it after.
Upvotes: 2
Reputation: 35210
Yes, List.map
in OCaml Standard Library is not tail-recursive. You have several options here:
Array.map
List.rev_map
(possibly coined with List.rev
)While the first two options are obvious, the latter two need some explanation.
If you really have a big amount of data, and if this data is numerical, then you should consider using Bigarrays. Also, depending on you use case, Maps and Hashtables might be better. People in functional programming languages tend to use lists everywhere, instead of choosing a proper data structure. Don't step into this pitfall.
The List.map
is non tail recursive for a good reason. There is always a tradeoff between stack usage and performance. For small lists (and this is the most common use case) a non-tail recursive version is much faster. So the standard library decided to provide a fast List.map
and in case if you need to work on big lists then you can use List.rev_map xs |> List.rev
. Moreover, sometimes you can omit the List.rev
part. So, the standard library, is not trying to think for you, it gives you a choice.
On the other hand, with time people managed to find some optimum in this tradeoff, i.e., having a constant stack version which is quite fast. The solution is to use a non tail recursive approach when a list is small, and then fallback to a slow version. Such implementations are provided by Janestreet Core Library, Batteries and Containers library.
So, you can switch to those libraries and forget about this issue. Though it is still highly advised to use List.map
scarcely and with care, as this operation is always has linear memory consumption (either heap or stack memory), that can be avoided. So, it is better to use rev_map
whenever it is possible.
Upvotes: 5