Reputation: 495
I am a beginner in SML. I have found the following code which seems to work; however, I cannot understand how it works.
fun delete(x,[]) = []
| delete(x,y::l) = if x=y then delete(x,l) else y::delete(x,l);
fun remove_dup [] = []
| remove_dup (x::l) = x::remove_dup(delete(x,l));
This code removes duplicates in a list of any kind.
I have tried to replicate it using more basic syntax as
fun remove (vec: int list)=
if null vec
then []
else remove(hd vec::1) = hd vec::remove(delete2(tl vec, 1))
but I cannot do more.
Can someone please aid me?
Upvotes: 0
Views: 769
Reputation: 16145
... the following code which seems to work; however, I cannot understand how it works.
fun delete(x,[]) = [] | delete(x,y::l) = if x=y then delete(x,l) else y::delete(x,l); fun remove_dup [] = [] | remove_dup (x::l) = x::remove_dup(delete(x,l));
I have tried to replicate it using more basic syntax ...
That seems like a sensible thing to do.
As molbdnilo points out in a comment, hd
and tl
are not more basic.
I might format delete
like this:
fun delete (x, []) = []
| delete (x, y :: rest) =
if x = y
then delete (x, rest) (* delete y *)
else y :: delete (x, rest) (* keep y *)
One way to understand this function is to explain what it does:
It takes one tuple of (x, ...)
where x
is a single element and the second part is a list of that kind of element. Pattern matching is used here, and so the list either takes shape of []
(the empty list) or y :: rest
(the list with at least one element y
and the remaining list of elements rest
).
When the list is empty (matches the pattern []
), the result is also empty: Deleting x
from the empty list is done by simply returning the value []
. Notice that in delete (x, []) = []
, the two []
s actually mean something slightly different: The first (on the left-hand side of =
) is a pattern, and the second (on the right-hand side of =
) is a value, an actual empty list.
When the list is not empty, it means that it consists of at least one element, y
, and a tail, rest
. When there is at least one element, y
, it means there is at least one candidate for deletion. If x = y
, then the "deletion" consists of returning a list much like the input list, except with y
not mentioned anywhere. So in the then
branch, there is no mention of y
, but instead only delete (x, rest)
which means:
The result is whatever is gained from deleting x
from the rest of the list.
Note also here that a similar distinction can be made about the pattern y :: ys
and the value y :: delete (x, rest)
. Here ::
is the value constructor for non-empty lists, and it is used to create a new list value where y
persists. But the tail of that list may have copies of x
deleted.
Another way to understand this function is by running it by hand.
Imagine a call like delete (3, [1,3,3,7])
; I will make one rewrite per line, which corresponds to a reduction of the computation:
(* 1 *) delete (3, [1,3,3,7])
(* 2 *) ~> if 3 = 1 then delete (3, [3,3,7]) else 1 :: delete (3, [3,3,7])
(* 3 *) ~> 1 :: delete (3, [3,3,7])
(* 4 *) ~> 1 :: (if 3 = 3 then delete (3, [3,7]) else 3 :: delete (3, [3,7]))
(* 5 *) ~> 1 :: delete (3, [3,7])
(* 6 *) ~> 1 :: (if 3 = 3 then delete (3, [7]) else 3 :: delete (3, [7]))
(* 7 *) ~> 1 :: delete (3, [7])
(* 8 *) ~> 1 :: (if 3 = 7 then delete (3, []) else 7 :: delete (3, []))
(* 9 *) ~> 1 :: 7 :: delete (3, [])
(* 10 *) ~> 1 :: 7 :: []
This is the initial function call.
Because the input list is non-empty, we skip the first pattern match and we fit the value (3, [1,3,3,7])
onto the pattern (x, y :: rest)
: You can do that easily by setting x = 3
, y = 1
and rest = [3,3,7]
. After we do that, we replace delete (3, [1,3,3,7])
with the full function body, if ...
with the variables replaced.
Since 3 = 1
is false, the else
branch is picked, so the entire expression is replaced with the sub-part in the else
branch, 1 :: delete (3, [3,3,7])
.
Some interesting things happen here:
(* 3 *)
contains 1 :: delete (3, [3,3,7])
, you have two half-computed results: On the one hand you want to construct the list 1 :: ...
, and on the other hand, you don't know the tail delete (3, [3,3,7])
yet, since it hasn't been computed.1 :: ...
, you have to compute delete (3, [3,3,7])
. This causes for a substitution of only the tail, so 1 :: (if 3 = 3 ...)
. Besides nesting the expressions, the reduction is pretty much the same, except...Since 3 = 3
is true, the then
branch is picked, and this results in only the value delete (3, [3,7])
with no y :: ...
part, since y = 3
and we're deleting 3
s, doing 3 :: ...
would clearly be a wrong thing to do.
Things repeat this way (either including y
or not) until (* 9 *)
where 1 :: 7 :: delete (3, [])
is asking for 3
to be deleted from the empty list. For the first time, this triggers the first case of delete (x, []) = []
and the recursion stops.
You can do a similar thing with remove_dup
; running by hand, that is.
And you might find out that it isn't very trivial because it contains two recursive functions, one calling the other one and feeding its input directly. No matter how you twist and turn this function, it is going to be a little difficult.
There are some ways to make it look a little easier:
fun remove_dup [] = []
| remove_dup (x :: rest) =
let val rest_without_x = delete (x, rest)
in x :: remove_dup(rest_without_x)
end
and you may be able to at least spot that
remove_dup [] = []
: Removing duplicates from the empty list is easy.x
, remove all occurrences of x
from rest
by doing delete (x, rest)
. This list, which I called rest_without_x
, is then fed to remove_dup
.x
, we want to include it once, and the one occurrence of x
we pick is incidentally the first one we encounter.If you want to run this by hand like in the example above, you probably want to run the original, compact one, though, since the function body only spans a single line. It could look like:
remove_dup [1,1,1,2,2,3,1,2,3]
~> 1 :: remove_dup(delete(1, [1,1,2,2,3,1,2,3]))
~> 1 :: remove_dup([2,2,3,2,3])
~> 1 :: 2 :: remove_dup(delete(2, [2,3,2,3]))
~> ...
Notice that I resolve the inner expression delete(1, ...)
before I can resolve remove_dup(...)
since its argument was not yet resolved. (This is a consequence of strict evaluation semantics; some functional programming languages let you call a function without having fully evaluated its arguments!)
Try and complete the rest yourself.
Upvotes: 2