Reputation: 7912
I am learning scala and by attending the Odersky course online. In the third assignment there is an exercise where you have to work with a tweet set. One of the operations a tweet set should support is union
. That is, a union between two sets is a new set containing tweets from both sets.
I implemented this operations as follows:
abstract class TweetSet {
def union(that: TweetSet): TweetSet
def incl(tweet: Tweet): TweetSet
}
class Empty extends TweetSet {
def union(that:TweetSet): TweetSet = that
def incl(tweet: Tweet): TweetSet = new NonEmpty(tweet, new Empty, new Empty)
}
class NonEmpty(elem: Tweet, left: TweetSet, right: TweetSet) extends TweetSet {
def incl(x: Tweet): TweetSet = {
if (x.text < elem.text) new NonEmpty(elem, left.incl(x), right)
else if (x.text > elem.text) new NonEmpty(elem, left, right.incl(x))
else this
}
def union(that: TweetSet): TweetSet =
(left union (right union that)) incl elem //:: 1 Okay
// ((right union left) union that) incl elem //:: 2 Infinite loop
}
I can't understand why the order used to merge sets matters. This really doesn't make any sense to me. The first line in my code works properly, whereas the second line causes an infinite loop. The order of the operands of an union should not matter, and I should get the same result independently from which line I decide to use.
The order I use is the order used by Odersky in its lecture online so I can't understand why it is causing an infinite loop in certain cases.
I think there is a bug in my code, but I can't find it. Any help would be extremely appreciated.
Upvotes: 0
Views: 321
Reputation: 16308
It's actually not infinite loop. It's finite loop, but second variant involves much more operations.
Assuming
case class Tweet(text: String)
We could define simple
var unionsCalled = 0
and add line
unionsCalled += 1
to both Empty
and NonEmpty
s union implementations
then running lines
(1 to 30).view map (_.toString) map Tweet map (new NonEmpty(_, new Empty, new Empty)) reduce (_ union _)
println(unionsCalled)
gives us final unionsCalled
value of 899
for first implementation variant and 149489
for second
This is happening because of inbalanced trees.
In worst case your alorigthm to union
single item and n
item sets should create new set of n - 1
element then include elem
and then that.elem
. Which could degrade to 2^(n/2)
sometimes.
To accurately implement efficient set you should search links in this article
I dare myself to reimplement your task using simple randomized search tree which is simplest to code
import scala.util.Random
case class Tweet(text: String)
abstract class TweetSet {
def union(that: TweetSet): TweetSet
def rankInsert(tweet: Tweet, rank: Int): TweetSet
def incl(tweet: Tweet): TweetSet = rankInsert(tweet, Random.nextInt())
def toList: List[Tweet]
def depth: Int
}
case object Empty extends TweetSet {
def depth = 0
def union(that: TweetSet): TweetSet = that
def rankInsert(tweet: Tweet, rank: Int) = NonEmpty(tweet, Empty, Empty, rank)
def toList = List()
}
case class NonEmpty(elem: Tweet,
left: TweetSet,
right: TweetSet,
rank: Int) extends TweetSet {
lazy val depth = (left.depth max right.depth) + 1
def attachToParent(tweet: Tweet, rank: Int) = if (tweet.text < elem.text)
NonEmpty(tweet, Empty, this, rank)
else NonEmpty(tweet, this, Empty, rank)
def rankInsert(tweet: Tweet, rank: Int) =
if (tweet.text == elem.text) this
else if (rank > this.rank) attachToParent(tweet, rank)
else if (tweet.text < elem.text) copy(left = left.rankInsert(tweet, rank))
else copy(right = right.rankInsert(tweet, rank))
def toList = left.toList ++ (elem :: right.toList)
def union(that: TweetSet): TweetSet = if (depth > that.depth) that.union(this)
else (that /: toList)(_ incl _ )
}
object TweetSet {
def empty: TweetSet = Empty
def apply(tweets: Tweet*) = (empty /: tweets)( _ incl _ )
}
Upvotes: 2