Giuseppe Pes
Giuseppe Pes

Reputation: 7912

union infinite loop scala

I am learning scala and by attending the Odersky course online. In the third assignment there is an exercise where you have to work with a tweet set. One of the operations a tweet set should support is union. That is, a union between two sets is a new set containing tweets from both sets.

I implemented this operations as follows:

abstract class TweetSet {
   def union(that: TweetSet): TweetSet
   def incl(tweet: Tweet): TweetSet
}

class Empty extends TweetSet {
 def union(that:TweetSet): TweetSet = that
 def incl(tweet: Tweet): TweetSet = new NonEmpty(tweet, new Empty, new Empty)
}

class NonEmpty(elem: Tweet, left: TweetSet, right: TweetSet) extends TweetSet {

  def incl(x: Tweet): TweetSet = {
     if (x.text < elem.text) new NonEmpty(elem, left.incl(x), right)
     else if (x.text > elem.text) new NonEmpty(elem, left, right.incl(x))
     else this
   }

   def union(that: TweetSet): TweetSet =
    (left union (right union that)) incl elem //:: 1 Okay
   // ((right union left) union that) incl elem //:: 2 Infinite loop
}

I can't understand why the order used to merge sets matters. This really doesn't make any sense to me. The first line in my code works properly, whereas the second line causes an infinite loop. The order of the operands of an union should not matter, and I should get the same result independently from which line I decide to use.

The order I use is the order used by Odersky in its lecture online so I can't understand why it is causing an infinite loop in certain cases.

I think there is a bug in my code, but I can't find it. Any help would be extremely appreciated.

Upvotes: 0

Views: 321

Answers (1)

Odomontois
Odomontois

Reputation: 16308

It's actually not infinite loop. It's finite loop, but second variant involves much more operations.

Assuming

case class Tweet(text: String)

We could define simple

var unionsCalled = 0

and add line

unionsCalled += 1

to both Empty and NonEmptys union implementations

then running lines

(1 to 30).view map (_.toString) map Tweet map (new NonEmpty(_, new Empty, new Empty)) reduce (_ union _)
println(unionsCalled)

gives us final unionsCalled value of 899 for first implementation variant and 149489 for second

This is happening because of inbalanced trees. In worst case your alorigthm to union single item and n item sets should create new set of n - 1 element then include elem and then that.elem. Which could degrade to 2^(n/2) sometimes.

To accurately implement efficient set you should search links in this article

I dare myself to reimplement your task using simple randomized search tree which is simplest to code

import scala.util.Random

case class Tweet(text: String)

abstract class TweetSet {
  def union(that: TweetSet): TweetSet
  def rankInsert(tweet: Tweet, rank: Int): TweetSet
  def incl(tweet: Tweet): TweetSet = rankInsert(tweet, Random.nextInt())
  def toList: List[Tweet]
  def depth: Int
}

case object Empty extends TweetSet {
  def depth = 0
  def union(that: TweetSet): TweetSet = that
  def rankInsert(tweet: Tweet, rank: Int) = NonEmpty(tweet, Empty, Empty, rank)
  def toList = List()
}

case class NonEmpty(elem: Tweet, 
                    left: TweetSet, 
                    right: TweetSet, 
                    rank: Int) extends TweetSet {
  lazy val depth = (left.depth max right.depth) + 1
  def attachToParent(tweet: Tweet, rank: Int) = if (tweet.text < elem.text)
    NonEmpty(tweet, Empty, this, rank)
  else NonEmpty(tweet, this, Empty, rank)

  def rankInsert(tweet: Tweet, rank: Int) =
    if (tweet.text == elem.text) this
    else if (rank > this.rank) attachToParent(tweet, rank)
    else if (tweet.text < elem.text) copy(left = left.rankInsert(tweet, rank))
    else copy(right = right.rankInsert(tweet, rank))

  def toList = left.toList ++ (elem :: right.toList)
  def union(that: TweetSet): TweetSet = if (depth > that.depth) that.union(this)
  else (that /: toList)(_ incl _ )
}

object TweetSet {
  def empty: TweetSet = Empty
  def apply(tweets: Tweet*) = (empty /:  tweets)( _ incl _ )
}

Upvotes: 2

Related Questions