CaTFooD
CaTFooD

Reputation: 73

How to create a very large list in erlang

we use lists:seq(1,100) to create a list with 100 element and the create speed is acceptable

but what if we create a list with 1000w element? lists:seq(1,10000000) is very slow.

do you have any idea to create a large list without any datacopy?

Upvotes: 2

Views: 975

Answers (3)

Elzor
Elzor

Reputation: 804

In fact you could try to use lazy lists: It's look like this:

seq(M, N) when M =< N ->
    fun() -> [M | seq(M+1, N)] end;
seq(_, _) ->
    fun () -> [] end.

E = seq(1, 1000000000000000).
[1|NextElem] =  E().
[2|NextElem2] = NextElem().

Upvotes: 2

tkowal
tkowal

Reputation: 9289

Actually, lists:seq/2 does not perform any copying. Here is actual implementation:

seq(First, Last)
    when is_integer(First), is_integer(Last), First-1 =< Last -> 
    seq_loop(Last-First+1, Last, []).

seq_loop(N, X, L) when N >= 4 ->
    seq_loop(N-4, X-4, [X-3,X-2,X-1,X|L]);
seq_loop(N, X, L) when N >= 2 ->
    seq_loop(N-2, X-2, [X-1,X|L]);
seq_loop(1, X, L) ->
    [X|L];
seq_loop(0, _, L) ->
    L.

So, it calls itself recursively, adding up to 4 elements to the accumulator in each call (third argument). Prepending elements to list does not involve any copying and the calls are tail recursive, so the stack has constant size.

EDIT: I am pasting some clarifications from comments for the sake of completeness: Erlang lists are linked lists. Internally [H|T], creates element H and adds pointer to T. T is never copied. You can do that, because T is immutable and it will never change, even if you create [H1 | T], [H2 | T] and [H3 | T] - T is shared between them.

Calling function with list as argument also doesn't involve copying :) Bigger data structures including lists are stored on the process heap. You only store pointer to the first element on the stack. Only sending message to another process would actually copy the list.

Creating very large list may be slow, because you can run out of physical memory and start using swap. END OF EDIT

About iterating, I would like to build on Nicolas Talfer answer - you sometimes want to pass arguments to function called in loop and you can use higher order functions and closures to do that:

main(_) ->
    InitialArgs = 6,
    InitialFun = fun() -> do_stuff(InitialArgs) end,
    Ret = forloop(InitialFun, 5),
    io:format("~p", [Ret]).

forloop(Fun, 1) ->
    Fun();
forloop(Fun, Count) ->
    RetVal = Fun(),
    NewFun = fun() -> do_stuff(RetVal) end,
    forloop(NewFun, Count-1).

do_stuff(Args) ->
    Args + 2.

Upvotes: 5

Nicolas Talfer
Nicolas Talfer

Reputation: 41

I assume you want to repeat an operation N times, independantly of the list content. Am I wrong? ie something like this:

[ fun() -> do_stuff() end || _X <- lists:seq(1,10000000) ].

Consider doing this instead:

foo(0) ->
    ok;
foo(Count) ->
    do_stuff(),
    foo(Count-1).

and just call foo(10000000).

Upvotes: 4

Related Questions