Preza8
Preza8

Reputation: 399

For loop in list

Poeple often use

for i in [0 .. 10] do something

but afaik that creates a list which is then iterated through, it appears to me it would make more sense to use

for i = 0 to 10 do something

without creating that unnecessary list but having the same behaviour. Am I missing something? (I guess that's the case)

Upvotes: 4

Views: 746

Answers (3)

Giving a somewhat different kind of answer but hopefully interesting to some

You are correct in that the F# compiler fails to apply the fast-for-loop optimization in this case. Good news, the F# compiler is open source and it's possible for us to improve upon it's behavior.

So here's a freebie from me:

fast-for-loop optimization happens in tastops.fs. It's rather primitive at the moment, great opportunity for us to improve upon.

// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <finishExpr>  do <bodyExpr>' expression over integers
// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <step> .. <finishExpr>  do <bodyExpr>' expression over integers when step is positive
let (|CompiledInt32ForEachExprWithKnownStep|_|) g expr = 
    match expr with 
    | Let (_enumerableVar, RangeInt32Step g (startExpr, step, finishExpr), _, 
           Let (_enumeratorVar, _getEnumExpr, spBind,
              TryFinally (WhileLoopForCompiledForEachExpr (_guardExpr, Let (elemVar,_currentExpr,_,bodyExpr), m), _cleanupExpr))) -> 

        let spForLoop = match spBind with SequencePointAtBinding(spStart) -> SequencePointAtForLoop(spStart) |  _ -> NoSequencePointAtForLoop 

        Some(spForLoop,elemVar,startExpr,step,finishExpr,bodyExpr,m)
    | _ -> 
        None

let DetectFastIntegerForLoops g expr = 
    match expr with 
    | CompiledInt32ForEachExprWithKnownStep g (spForLoop,elemVar,startExpr,step,finishExpr,bodyExpr,m) 
         // fast for loops only allow steps 1 and -1 steps at the moment
         when step = 1 || step = -1 -> 
            mkFastForLoop  g (spForLoop,m,elemVar,startExpr,(step = 1),finishExpr,bodyExpr)
    | _ -> expr

The problem here is that RangeInt32Step only detects patterns like 0..10 and 0..1..10. It misses for instance [0..10]

Let's introduce another active pattern SeqRangeInt32Step that matches these kind of expressions:

let (|SeqRangeInt32Step|_|) g expr =
    match expr with
    // detect '[n .. m]'
    | Expr.App(Expr.Val(toList,_,_),_,[TType_var _],
                [Expr.App(Expr.Val(seq,_,_),_,[TType_var _],
                          [Expr.Op(TOp.Coerce, [TType_app (seqT, [TType_var _]); TType_var _],
                                    [RangeInt32Step g (startExpr, step, finishExpr)], _)],_)],_)
        when
            valRefEq g toList (ValRefForIntrinsic g.seq_to_list_info) &&
            valRefEq g seq g.seq_vref &&
            tyconRefEq g seqT g.seq_tcr ->
            Some(startExpr, step, finishExpr)

    | _ -> None

How do you figure out that this is what you need to pattern match for? The approach I often take is that I do a simple F# program with the right properties and put a breakpoint during compilation to inspect the expression. From that I create the pattern to match for:

Let's put the two patterns together:

let (|ExtractInt32Range|_|) g expr =
  match expr with
  | RangeInt32Step g range -> Some range
  | SeqRangeInt32Step g range -> Some range
  | _ -> None

CompiledInt32ForEachExprWithKnownStep is updated to use ExtractInt32Range over RangeInt32Step

The complete solution would be something like this:

let (|SeqRangeInt32Step|_|) g expr =
    match expr with
    // detect '[n .. m]'
    | Expr.App(Expr.Val(toList,_,_),_,[TType_var _],
                [Expr.App(Expr.Val(seq,_,_),_,[TType_var _],
                          [Expr.Op(TOp.Coerce, [TType_app (seqT, [TType_var _]); TType_var _],
                                    [RangeInt32Step g (startExpr, step, finishExpr)], _)],_)],_)
        when
            valRefEq g toList (ValRefForIntrinsic g.seq_to_list_info) &&
            valRefEq g seq g.seq_vref &&
            tyconRefEq g seqT g.seq_tcr ->
            Some(startExpr, step, finishExpr)

    | _ -> None

let (|ExtractInt32Range|_|) g expr =
  match expr with
  | RangeInt32Step g range -> Some range
  | SeqRangeInt32Step g range -> Some range
  | _ -> None

// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <finishExpr>  do <bodyExpr>' expression over integers
// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <step> .. <finishExpr>  do <bodyExpr>' expression over integers when step is positive
let (|CompiledInt32ForEachExprWithKnownStep|_|) g expr = 
    match expr with 
    | Let (_enumerableVar, ExtractInt32Range g (startExpr, step, finishExpr), _,
           Let (_enumeratorVar, _getEnumExpr, spBind,
              TryFinally (WhileLoopForCompiledForEachExpr (_guardExpr, Let (elemVar,_currentExpr,_,bodyExpr), m), _cleanupExpr))) -> 

        let spForLoop = match spBind with SequencePointAtBinding(spStart) -> SequencePointAtForLoop(spStart) |  _ -> NoSequencePointAtForLoop 

        Some(spForLoop,elemVar,startExpr,step,finishExpr,bodyExpr,m)
    | _ -> 
        None

Using a simple test program

let print v =
    printfn "%A" v

[<EntryPoint>]
let main argv =
    for x in [0..10] do
        print x

    0

Before the optimization the corresponding C# code would look something like this (IL code is better to inspect but can be a bit hard to understand if one is unused to it):

// Test
[EntryPoint]
public static int main(string[] argv)
{
    FSharpList<int> fSharpList = SeqModule.ToList<int>(Operators.CreateSequence<int>(Operators.OperatorIntrinsics.RangeInt32(0, 1, 10)));
    IEnumerator<int> enumerator = ((IEnumerable<int>)fSharpList).GetEnumerator();
    try
    {
        while (enumerator.MoveNext())
        {
            Test.print<int>(enumerator.Current);
        }
    }
    finally
    {
        IDisposable disposable = enumerator as IDisposable;
        if (disposable != null)
        {
            disposable.Dispose();
        }
    }
    return 0;
}

F# creates a list and then uses the enumerator to iterate over it. No wonder it's rather slow compared to a classic for-loop.

After the optimization is applied we get this code:

// Test
[EntryPoint]
public static int main(string[] argv)
{
    for (int i = 0; i < 11; i++)
    {
        Test.print<int>(i);
    }
    return 0;
}

A significant improvement.

So steal this code, post a PR to https://github.com/Microsoft/visualfsharp/ and bask in glory. Of course you need to add unit tests and emitted IL code tests which can be somewhat tricky to find the right level for, check this commit for inspiration

PS. Probably should support [|0..10|] as well seq {0..10} as well

PS. In addition for v in 0L..10L do print v as well as for v in 0..2..10 do print v is also inefficiently implemented in F#.

Upvotes: 7

Michel Billaud
Michel Billaud

Reputation: 1826

The former form requires a special construct in the language (for var from ... to ... by), it is the way followed by ancient programming languages :

  • 'do' loop in Fortran
  • for var:= expr to expr in Pascal
  • etc.

The latter form (for var in something) is more général. It works on plain lists, but also with generators (like in python) etc. A construction of the full list may not be needed before running the list. This allows to write loops on potentially infinite lists.

Anyway, a decent compiler/interpreter should recognize the rather frequent special case [expr1..expr2] and avoid the computation and storage of the intermediate list.

Upvotes: 0

Tomas Petricek
Tomas Petricek

Reputation: 243041

You are correct, writing for i in [0 .. 10] do something generates a list and it does have a significant overhead. Though you can also omit the square brackets, in which case it just builds a lazy sequence (and, it turns out that the compiler even optimizes that case). I generally prefer writing in 0 .. 100 do because it looks the same as code that iterates over a sequence.

Using the #time feature of F# interactive to do a simple analysis:

for i in [ 0 .. 10000000 ] do // 3194ms (yikes!)
  last <- i

for i in 0 .. 10000000 do     // 3ms
  last <- i

for i = 0 to 10000000 do      // 3ms
  last <- i

for i in seq { 0 .. 10000000 } do // 709ms (smaller yikes!)
  last <- i

So, it turns out that the compiler actually optimizes the in 0 .. 10000000 do into the same thing as the 0 to 10000000 do loop. You can force it to create the lazy sequence explicitly (last case) which is faster than a list, but still very slow.

Upvotes: 13

Related Questions