Cedric Royer-Bertrand
Cedric Royer-Bertrand

Reputation: 741

F# - Algorithm and strings

Let's says I have a string of a length N that contains only 0 or 1. I want to split that string in multiples strings and each string should contains only one digit.

Example:

00011010111

Should be split into:

  • 000
  • 11
  • 0
  • 1
  • 0
  • 111

The only solution I can think of if using a for loop with a string builder (Written in pseudo code below, more c# like sorry):

 result = new list<string>
 tmpChar = ""
 tmpString = ""
 for each character c in MyString
     if tmpchar != c
         if tmpsString != ""
             result.add tmpString
             tmpString = ""
         endIf
         tmpchar = c
     endIf
     tmpString += tmpChar
 endFor

Do you have any other solution and maybe a clever solution that use a more functional approach?

Upvotes: 0

Views: 150

Answers (5)

Brett Rowberry
Brett Rowberry

Reputation: 1050

Here's a working version of OP's proposal with light syntax:

let chunk (s: string) =
  let result = System.Collections.Generic.List<string>()
  let mutable tmpChar = ""
  let mutable tmpString = ""
  for c in s do
    if tmpChar <> string c then
      if tmpString <> "" then
        result.Add tmpString
        tmpString <- ""
      tmpChar <- string c
    tmpString <- tmpString + tmpChar
  result.Add tmpString
  result

No attempt was made to follow a functional style.

Upvotes: 0

Rodrick Chapman
Rodrick Chapman

Reputation: 5543

Consider this function (which is generic):

let chunk s =
if Seq.isEmpty s then []
else
    let rec chunk items chunks =
        if Seq.isEmpty items then chunks
        else             
            let chunks' =
                match chunks with
                | [] -> [(Seq.head items, 1)]
                | x::xs ->               
                    let c,n = x in let c' = Seq.head items in
                    if c = c' then (c, n + 1) :: xs else (c', 1) :: x :: xs

            chunk (Seq.tail items) chunks'

    chunk s [] |> List.rev

It returns a list of tuples, where each tuple represents an item and its repetitions.

So

"00011010111" |> Seq.toList |> chunk 

actually returns

[('0', 3); ('1', 2); ('0', 1); ('1', 1); ('0', 1); ('1', 3)]

Basically, we're doing run length encoding (which is admittedly a bit wasteful in the case of your example string).

To get the list of strings that you want, we use code like following:

"00011010111" 
|> Seq.toList 
|> chunk 
|> List.map (fun x -> let c,n = x in new string(c, n))

Upvotes: 0

J&#252;rgen H&#246;tzel
J&#252;rgen H&#246;tzel

Reputation: 19757

Seq.fold (fun (acc:(string list)) x ->
          match acc with
          | y::rst when y.StartsWith(string x) -> (string x) + y::rst
          | _ -> (string x)::acc)
        []
        "00011010111"

Upvotes: 2

Nathan Wilson
Nathan Wilson

Reputation: 639

Perhaps something along the lines of:

let result = 
  let rec groupWhileSame xs result =
    match xs with 
    | a when a |> Seq.isEmpty -> result
    | _ -> 
      let head = xs |> Seq.head
      let str = xs |> Seq.takeWhile ((=) head)
      let rest = xs |> Seq.skipWhile ((=) head)
      groupWhileSame rest (Seq.append result [str])

  groupWhileSame (myStr) []

Upvotes: 3

JosephStevens
JosephStevens

Reputation: 1706

I think Seq.scan would be a good fit for this, this is a very procedural problem in nature, preserving the order like that. But here is code that I believe does what you are asking.

"00011010111"
|> Seq.scan (fun (s, i) x ->
    match s with
    | Some p when p = x -> Some x, i
    | _ -> Some x, i + 1 ) (None, 0)
|> Seq.countBy id
|> Seq.choose (function 
| (Some t, _), n -> Some(t, n)
| _ -> None )
|> Seq.toList

Upvotes: 3

Related Questions