colinfang
colinfang

Reputation: 21727

How smart is pattern match?

My program spends most of time on array pattern match, I am wondering if I should rewrite the function and discard the auto pattern matching.

E.g. a very simple case

let categorize array =
    match array with
    | [|(1|2);(1|2);(1|2)|] -> 3
    | [|(1|2);(1|2);_|] -> 2
    | [|(1|2);_;_|] -> 1
    | _ -> 0

categorize [|2;1;3|]

Would the compiler apply the least amount of comparisons in this case, by recognizing that e.g. the first case is the same as the second case except for the third element.

Actually the patterns are more complicated, the pre optimized pattern matching could cost way more time than fully optimized pattern matching.

Upvotes: 3

Views: 473

Answers (4)

colinfang
colinfang

Reputation: 21727

Test 1

    F#
        let test1 x =
            match x with
            | [| 1; 2; 3 |] -> A
            | [| 1; 2; _ |] -> A
            | [| 1; _; _ |] -> A
    Decompiled C#
        if (x != null && x.Length == 3)
        {
            switch (x[0])
            {
            case 1:
                switch (x[1])
                {
                case 2:
                    switch (x[2])
                    {
                    case 3:
                        return Program.MyType.A;
                    default:
                        return Program.MyType.A;
                    }
                    break;
                default:
                    return Program.MyType.A;
                }
                break;
            }
        }
        throw new MatchFailureException(...);
    Decompiled IL
        Code size 107

Conclusion

  1. Pattern Match doesn't optimize based on the values after ->.
  2. Pattern Match is able to find the optimized approach for array decomposition under conclusion 1.
  3. Incomplete pattern matches always throw exceptions, so there is no harm to add a wildcard to catch the missing patterns and throw exceptions explicitly.

Test 2

    F#
        let test2 x =
            match x with
            | [| 1; 2; 3 |] -> A
            | [| _; 2; 3 |] -> B
            | [| _; _; 3 |] -> C
    Decompiled C#
        if (x != null && x.Length == 3)
        {
            switch (x[0])
            {
            case 1:
                switch (x[1])
                {
                case 2:
                    switch (x[2])
                    {
                    case 3:
                        return Program.MyType.A;
                    default:
                        goto IL_49;
                    }
                    break;
                default:
                    switch (x[2])
                    {
                    case 3:
                        break;
                    default:
                        goto IL_49;
                    }
                    break;
                }
                break;
            default:
                switch (x[1])
                {
                case 2:
                    switch (x[2])
                    {
                    case 3:
                        return Program.MyType.B;
                    default:
                        goto IL_49;
                    }
                    break;
                default:
                    switch (x[2])
                    {
                    case 3:
                        goto IL_58;
                    }
                    goto IL_49;
                }
                break;
            }
            IL_58:
            return Program.MyType.C;
        }
        IL_49:
        throw new MatchFailureException(...);
    Decompiled IL
        Code size 185

Conclusion

  1. Pattern Match checks values from the beginning of an array to end. So it fails to find the optimized approach.
  2. Code size is 2x as much as an optimal one.

Test 3

    F#
        let test3 x =
            match x with
            | [| 1; 2; 3 |] -> A
            | [| 1; 2; a |] when a <> 3 -> B
            | [| 1; 2; _ |] -> C
    Decompiled C#
        if (x != null && x.Length == 3)
        {
            switch (x[0])
            {
            case 1:
                switch (x[1])
                {
                case 2:
                    switch (x[2])
                    {
                    case 3:
                        return Program.MyType.A;
                    default:
                        if (x[2] != 3)
                        {
                            int a = x[2];
                            return Program.MyType.B;
                        }
                        break;
                    }
                    break;
                }
                break;
            }
        }
        if (x != null && x.Length == 3)
        {
            switch (x[0])
            {
            case 1:
                switch (x[1])
                {
                case 2:
                    return Program.MyType.C;
                }
                break;
            }
        }
        throw new MatchFailureException(...);

Conclusion

  1. The compiler isn't smart enough to see through Guard to check completeness/duplicity.
  2. Guard makes Pattern Match produce weird unoptimized code.

Test 4

    F#
        let (| Is3 | IsNot3 |) x =
           if x = 3 then Is3 else IsNot3
        let test4 x =
            match x with
            | [| 1; 2; 3      |] -> A
            | [| 1; 2; Is3    |] -> B
            | [| 1; 2; IsNot3 |] -> C
            | [| 1; 2; _      |] -> D // This rule will never be matched.
    Decompiled C#
        if (x != null && x.Length == 3)
        {
            switch (x[0])
            {
            case 1:
                switch (x[1])
                {
                case 2:
                    switch (x[2])
                    {
                    case 3:
                        return Program.MyType.A;
                    default:
                    {
                        FSharpChoice<Unit, Unit> fSharpChoice = Program.|Is3|IsNot3|(x[2]);
                        if (fSharpChoice is FSharpChoice<Unit, Unit>.Choice2Of2)
                        {
                            return Program.MyType.C;
                        }
                        return Program.MyType.B;
                    }
                    }
                    break;
                }
                break;
            }
        }
        throw new MatchFailureException(...);

Conclusion

  1. Multiple cases Active Patterns compile to FSharpChoice.
  2. The compiler is able to check completeness/duplicity of active patterns, however it cannot compare them with normal patterns.
  3. Unreached patterns are not compiled.

Test 5

    F#
        let (| Equal3 |) x =
           if x = 3 then Equal3 1 else Equal3 0 // Equivalent to "then 1 else 0"
        let test5 x =
            match x with
            | [| 1; 2; 3        |] -> A
            | [| 1; 2; Equal3 0 |] -> B
            | [| 1; 2; Equal3 1 |] -> C
            | [| 1; 2; _        |] -> D
    Decompiled C#
        if (x != null && x.Length == 3)
        {
            switch (x[0])
            {
            case 1:
                switch (x[1])
                {
                case 2:
                    switch (x[2])
                    {
                    case 3:
                        return Program.MyType.A;
                    default:
                    {
                        int num = x[2];
                        switch ((num != 3) ? 0 : 1)
                        {
                        case 0:
                            return Program.MyType.B;
                        case 1:
                            return Program.MyType.C;
                        default:
                            return Program.MyType.D;
                        }
                        break;
                    }
                    }
                    break;
                }
                break;
            }
        }
        throw new MatchFailureException(...);

Conclusion

  1. Single case Active Patterns compile to the return type.
  2. The compiler sometimes auto inline the function.

Test 6

    F#
        let (| Partial3 | _ |) x =
           if x = 3 then Some (Partial3 true) else None // Equivalent to "then Some true"
        let test6 x =
            match x with
            | [| 1; 2; 3 |] -> A
            | [| 1; 2; Partial3 true |] -> B
            | [| 1; 2; Partial3 true |] -> C
    Decompiled C#
        if (x != null && x.Length == 3)
        {
            switch (x[0])
            {
            case 1:
                switch (x[1])
                {
                case 2:
                    switch (x[2])
                    {
                    case 3:
                        return Program.MyType.A;
                    default:
                    {
                        FSharpOption<bool> fSharpOption = Program.|Partial3|_|(x[2]);
                        if (fSharpOption != null && fSharpOption.Value)
                        {
                            return Program.MyType.B;
                        }
                        break;
                    }
                    }
                    break;
                }
                break;
            }
        }
        if (x != null && x.Length == 3)
        {
            switch (x[0])
            {
            case 1:
                switch (x[1])
                {
                case 2:
                {
                    FSharpOption<bool> fSharpOption = Program.|Partial3|_|(x[2]);
                    if (fSharpOption != null && fSharpOption.Value)
                    {
                        return Program.MyType.C;
                    }
                    break;
                }
                }
                break;
            }
        }
        throw new MatchFailureException(...);

Conclusion

  1. Partial Active Patterns compile to FSharpOption.
  2. The compiler is unable to check completeness/duplicity of partial active patterns.

Test 7

    F#
        type MyOne =
            | AA
            | BB of int
            | CC
        type MyAnother =
            | AAA
            | BBB of int
            | CCC
            | DDD
        let test7a x =
            match x with
            | AA -> 2
        let test7b x =
            match x with
            | AAA -> 2
    Decompiled C#
        public static int test7a(Program.MyOne x)
        {
            if (x is Program.MyOne._AA)
            {
                return 2;
            }
            throw new MatchFailureException(...);
        }
        public static int test7b(Program.MyAnother x)
        {
            if (x.Tag == 0)
            {
                return 2;
            }
            throw new MatchFailureException(...);
        }

Conclusion

  1. If there are more than 3 cases in the union, Pattern Match would use Tag property instead of is. (It also applies to Multiple cases Active Patterns.)
  2. Often a Pattern Match would result in multiple is which degenerate performance greatly.

Upvotes: 0

Be Brave Be Like Ukraine
Be Brave Be Like Ukraine

Reputation: 7735

What is really missing in your question is the actual subject area. In other words, your question is quite generic (which is, generally, good for SO), while coding against on your actual problem may solve the entire issue in an elegant manner.

If I extrapolate your question as it currently stands, you just need the index of the first element which is neither 1 nor 2, and the implementation is trivial:

let categorize arr =
    try
        Array.findIndex (fun x -> not(x = 1 || x = 2)) arr
    with
        | :? System.Collections.Generic.KeyNotFoundException -> Array.length arr

// Usage
let x1 = categorize [|2;1;3|]    // returns 2
let x2 = categorize [|4;2;1;3|]  // returns 0
let x3 = categorize [|1;2;1|]    // returns 3

As several free benefits, you get the code that is array length-agnostic and absolutely readable.
Is this what you need?

Upvotes: 1

J D
J D

Reputation: 48687

You could write:

let f (xs: _ []) =
  if xs.Length=3 then
    let p n = n=1 || n=2
    if p xs.[0] then
      if p xs.[1] then
        if p xs.[2] then 3
      else 2
    else 1
  else 0

Upvotes: 0

Daniel
Daniel

Reputation: 47904

Straight from Reflector:

public static int categorize(int[] array)
{
    if ((array > null) && (array.Length == 3))
    {
        switch (array[0])
        {
            case 1:
                switch (array[1])
                {
                    case 1:
                        switch (array[2])
                        {
                            case 1:
                            case 2:
                                goto Label_005C;
                        }
                        goto Label_005A;

                    case 2:
                        switch (array[2])
                        {
                            case 1:
                            case 2:
                                goto Label_005C;
                        }
                        goto Label_005A;
                }
                goto Label_0042;

            case 2:
                switch (array[1])
                {
                    case 1:
                        switch (array[2])
                        {
                            case 1:
                            case 2:
                                goto Label_005C;
                        }
                        goto Label_005A;

                    case 2:
                        switch (array[2])
                        {
                            case 1:
                            case 2:
                                goto Label_005C;
                        }
                        goto Label_005A;
                }
                goto Label_0042;
        }
    }
    return 0;
Label_0042:
    return 1;
Label_005A:
    return 2;
Label_005C:
    return 3;
}

I don't see anything inefficient.

Upvotes: 6

Related Questions