Reputation: 181
In OCaml, suppose I have a string list as follows:
let ls : string list = ["A"; "A"; "B"; "B"; "A"; "Y"; "Y"; "Y"] ;;
I'm having trouble writing a function that calculates how many times an element occurs consecutively and also pairs up that element with its frequency. For instance, given the above list as input, the function should return [("A", 2); ("B", 2), ("A", 1), ("Y", 3)]
.
I've tried looking for some hints elsewhere but almost all other similar operations are done using int list
s, where it is easy to simply add numbers up. But here, we cannot add strings.
My intuition was to use something like fold_left
in some similar fashion as below:
let lis : int list = [1;2;3;4;5]
let count (lis : int list) = List.fold_left (fun x y -> x + y) (0) (lis)
which is essentially summing all the elements cumulatively from left to right. But, in my case, I don't want to cumulatively sum all the elements, I just need to count how many times an element occurs consecutively. Some advice would be appreciated!
Upvotes: 0
Views: 975
Reputation: 36496
You're doing run length encoding. Essentially you need to iterate over the list and if you find consecutive elements, increment a count. This becomes easier if we use an accumulator. For efficiency we build it up backwards and then reverse it on reaching the end of the list.
But taking this step for efficiency also means it's very easy to access the last element we added on.
We need to worry about three cases: when the list is empty we return the accumulator (reversed). If both the list and the accumulator are not empty, we compare the current element x
with the top tuple in the result (y, c)
. If x
and y
are equal, we need to recursively call, updating that top tuple on the list. If they're not equal, we just need to add a tuple to the result with (x, 1)
.
# let rec rle lst acc =
match lst, acc with
| [], _ -> List.rev acc
| x::xs, (y, c)::ys when x = y -> rle xs @@ (y, c + 1) :: ys
| x::xs, _ -> rle xs @@ (x, 1) :: acc;;
val rle : 'a list -> ('a * int) list -> ('a * int) list = <fun>
# rle ["A"; "A"; "B"; "B"; "A"; "Y"; "Y"; "Y"] [];;
- : (string * int) list = [("A", 2); ("B", 2); ("A", 1); ("Y", 3)]
If we look at how this works for your example:
rle ["A"; "A"; "B"; "B"; "A"; "Y"; "Y"; "Y"] []
rle ["A"; "B"; "B"; "A"; "Y"; "Y"; "Y"] [("A", 1)]
rle ["B"; "B"; "A"; "Y"; "Y"; "Y"] [("A", 2)]
rle ["B"; "A"; "Y"; "Y"; "Y"] [("B", 1); ("A", 2)]
rle ["A"; "Y"; "Y"; "Y"] [("B", 2); ("A", 2)]
rle ["Y"; "Y"; "Y"] [("A", 1); ("B", 2); ("A", 2)]
rle ["Y"; "Y"] [("Y", 1); ("A", 1); ("B", 2); ("A", 2)]
rle ["Y"] [("Y", 2); ("A", 1); ("B", 2); ("A", 2)]
rle [] [("Y", 3); ("A", 1); ("B", 2); ("A", 2)]
List.rev [("Y", 3); ("A", 1); ("B", 2); ("A", 2)]
[("A", 2); ("B", 2); ("A", 1); ("Y", 3)]
Taking this a step further you might use sequences to generate the runtime encoding lazily on not just lists but anything that can be converted to a sequence.
let rle_seq seq =
let rec aux ?(acc=None) seq () =
Seq.(
match seq (), acc with
| Nil, None -> Nil
| Nil, Some acc' ->
Cons (acc', empty)
| Cons (x, seq'), None ->
aux ~acc: (Some (x, 1)) seq' ()
| Cons (x, seq'), Some (last, count) when x = last ->
aux ~acc: (Some (last, count+1)) seq' ()
| Cons (x, seq'), Some acc' ->
Cons (acc', aux ~acc: (Some (x, 1)) seq')
)
in
aux seq
# [|1; 3; 3; 5; 7; 8; 9; 1; 1; 1; 2|]
|> Array.to_seq
|> rle_seq
|> List.of_seq;;
- : (int * int) list =
[(1, 1); (3, 2); (5, 1); (7, 1); (8, 1); (9, 1); (1, 3); (2, 1)]
Upvotes: 0
Reputation: 66818
This is obviously a homework assignment, so I will just give a couple of hints.
When you get your code working, it won't be adding strings (or any other type) together. It will be adding ints together. So you might want to look back at those examples on the net again :-)
You can definitely use fold_left
to get an answer. First, note that the resultl is a list of pairs. The first element of each pair can be any type, depending on the type of the original list. The second element in each pair is an int. So you have a basic type that you're working with: ('a * int) list
.
Imagine that you have a way to "increment" such a list:
let increment (list: ('a * int) list) value =
(* This is one way to think of your problem *)
This function looks for the pair in the list whose first element is equal to value. If it finds it, it returns a new list where the associated int is one larger than before. If it doesn't find it, it returns a new list with an extra element (value, 1)
.
This is the basic operation you want to fold over your list, rather than the +
operation of your example code.
Upvotes: 0