Bruce Ferguson

Reputation: 1861

Parallel collections in Scala 2.9 and Actors

Ok, this might be a rather silly question, but what is the benefit of using parallel collections within an actor framework? That is, if I'm only dealing with one message at a time from an actor's mailbox, is there even a need for a parallel collection? Are parallel collections and actors mutually exclusive? What is a use case that would involve both?

Upvotes: 11

Views: 1656

Answers (2)

Johan Prinsloo

Reputation: 1188

They solve different problems. Actors are good at solving task-parallel problems, while parallel collections are good at solving data-parallel problems. I don't think they are mutually exclusive - you can use parallel collections inside actors, and you can have parallel collections that contain actors.


Edit - quick test: Even something as simple as an actor notification loop benefits.

In the following code we register a million actors with an actor registry which has to notify them of an event.

The non-parallel notification loop ( registry foreach {} ) takes an average of 2.8 seconds on my machine (4 core 2.5 GHz notebook). When the parallel collection loop ( registry.par.foreach {} ) is used it takes 1.2 seconds and uses all four cores.

import scala.actors.Actor

case class Register(actor: Actor)
case class Unregister(actor: Actor)
case class Message( contents: String )

object ActorRegistry extends Actor{
  var registry: Set[Actor] = Set.empty

  def act() {
    loop{
      react{
        case reg: Register => register( reg.actor )
        case unreg: Unregister => unregister( unreg.actor )
        case message: Message => fire( message )
      }
    }
  }

  def register(reg: Actor) { registry += reg }

  def unregister(unreg: Actor) { registry -= unreg }

  def fire(msg: Message){
    val starttime = System.currentTimeMillis()

    registry.par.foreach { client => client ! msg } //swap registry foreach for single th

    val endtime = System.currentTimeMillis()
    println("elapsed: " + (endtime - starttime) + " ms")
  }
}

class Client(id: Long) extends Actor{
  var lastmsg = ""
  def act() {
    loop{
      react{
        case msg: Message => got(msg.contents)
      }
    }
  }
  def got(msg: String) {
    lastmsg = msg
  }
}

object Main extends App {

  ActorRegistry.start
  for (i <- 1 to 1000000) {
    val client = new Client(i)
    client.start
    ActorRegistry ! Register( client )
  }

  ActorRegistry ! Message("One")

  Thread.sleep(6000)

  ActorRegistry ! Message("Two")

  Thread.sleep(6000)

  ActorRegistry ! Message("Three")

}

Upvotes: 15

Vasil Remeniuk

Reputation: 20627

The actors library in Scala is just one of several approaches to concurrency (threads and locks, STM, futures/promises), and it's not supposed to be used for every kind of problem or to combine with everything (though actors and STM can work well together). In some cases, setting up a group of actors (workers plus a supervisor), or explicitly splitting a task into portions to feed to the fork-join pool, is too cumbersome. It's much handier to call .par on a collection you're already using and simply traverse it in parallel, gaining a performance benefit almost for free in terms of setup.
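To give a rough idea of how little setup that is, here is a minimal sketch. It assumes Scala 2.9 through 2.12, where .par is built into the standard collections (from 2.13 on it lives in the separate scala-parallel-collections module and needs `import scala.collection.parallel.CollectionConverters._`). The names `ParSketch`, `totalLengthSeq` and `totalLengthPar` are made up for illustration.

```scala
// Hypothetical example: the object and method names are illustrative only.
object ParSketch {
  // Sequential traversal of an ordinary collection.
  def totalLengthSeq(words: Vector[String]): Int =
    words.map(_.length).sum

  // Same traversal; .par hands the work to the parallel collections machinery.
  def totalLengthPar(words: Vector[String]): Int =
    words.par.map(_.length).sum

  def main(args: Array[String]): Unit = {
    val words = Vector("actors", "parallel", "collections")
    // Both versions should agree; only the execution strategy differs.
    println(totalLengthSeq(words) == totalLengthPar(words))
  }
}
```

For a collection this small the parallel version will not be faster; the point is only that the calling code barely changes.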

All in all, actors and parallel collections are different dimensions of the problem - actors are a concurrency paradigm, whilst parallel collections are just a useful tool that should be viewed not as a concurrency alternative but as an augmentation of the collections toolset.

Upvotes: 2
