Worice
Worice

Reputation: 4037

Pure pattern matching

I am building a function that counts of many times a character appears in a string after the nth position.

countCh ("aaabbbccc", 3, 'b')
val it: int = 2

In C, I would use an accumulator with a while loop. But I am trying to learn the F# functional face, where this approach is discouraged. So I used guards to test few conditions and build the function:

let rec countCh (s:string, n:int, ch:char) =
match s, n, ch with
   | (s, n, ch) when n > s.Length -> 0                          //p1
   | (s, n, ch) when n < 0 -> 0                                 //p2
   | (s, n, ch) when s.[n] <> ch -> countCh(s, n + 1, ch)       //p3
   | (s, n, ch) when s.[n] = ch -> 1 + countCh(s, n + 1, ch)    //p4

The coexistence of patterns 3 and 4 is problematic (impossible, I am afraid). Even if it compiles, I have not been able to make it work. How can this task functionally be handled?

Upvotes: 2

Views: 104

Answers (3)

Anton Schwaighofer
Anton Schwaighofer

Reputation: 3149

I'd suggest to not write it yourself, but ask the library functions for help:

let countCh (s: string, n, c) = 
    s.Substring(n+1).ToCharArray()
    |> Seq.filter ((=) c)
    |> Seq.length

Or use Seq.skip, along with the fact that you can drop the conversion to character array:

let countCh (s: string, n, c) = 
    s
    |> Seq.skip (n + 1)
    |> Seq.filter ((=) c)
    |> Seq.length

Upvotes: 2

Stuart
Stuart

Reputation: 5496

I agree with the other answers, but I'd like to help you with your original question. You need to indent the function, and you have an off by one bug:

let rec countCh (s:string, n:int, ch:char) =
    match s, n, ch with
        | s, n,  _ when n >= s.Length-1 -> 0                        //p1
        | s, _,  _ when n < 0 -> 0                                  //p2
        | s, n, ch when s.[n+1] <> ch -> countCh(s, n+2, ch)        //p3
        | s, n, ch when s.[n+1] = ch -> 1 + countCh(s, n+2, ch)     //p4

Upvotes: 2

Fyodor Soikin
Fyodor Soikin

Reputation: 80744

First, the coexistence of these branches is not problematic. They don't conflict with each other. Why do you think that it's problematic? Is it because you get an "Incomplete pattern match" compiler warning? That warning does not tell you that the branches conflict, it tells you that the compiler can't prove that the four branches cover all possibilities. Or do you think that for some other reason? If you want your questions to be answered accurately, you'll have to ask them more clearly.

Second, you're abusing the pattern matching. Look: there are no patterns! The patterns in every branch are exactly the same, and trivial. Only guards are different. This looks very counterintuitively within a match, but would be plainly expressed with if..elif:

let rec countCh (s:string) n ch =
   if n >= s.Length || n < 0 then 0
   elif s.[n] = ch then 1 + countCh s (n + 1) ch
   else countCh s (n + 1) ch

NOTE 1: see how I made the parameters curried? Always use curried form, unless there is a very strong reason to use tupled. Curried parameters are much more convenient to use on the caller side.

NOTE 2: your condition n > s.Length was incorrect: string indices go from 0 to s.Length-1, so the bail condition should be n >= s.Length. It is corrected in my code.

Finally, since this is an exercise, I must point out that the recursion is not tail recursion. Look at the second branch (in my code): it calls the function recursively and then adds one to the result. Since you have to do something with the result of the recursive call, the recursion can't be "tail". This means you risk stack overflow on very long inputs.

To make this into tail recursion, you need to turn the function "inside out", so to say. Instead of returning the result from every call, you need to pass it into every call (aka "accumulator"), and only return from the terminal case:

let rec countCh (s:string) n ch countSoFar =
   if n >= s.Length || n < 0 then countSoFar
   elif s.[n] = ch then countCh s (n+1) ch (countSoFar+1)
   else countCh s (n+1) ch countSoFar

// Usage:
countCh "aaaabbbccc" 5 'b' 0

This way, every recursive call is the "last" call (i.e. the function doesn't do anything with the result, but passes it straight out to its own caller). This is called "tail recursion" and can be compiled to work in constant stack space (as opposed to linear).

Upvotes: 4

Related Questions