Kartik Aiyer
Kartik Aiyer

Reputation: 591

Need help figuring out performance bottleneck

I'm working on my scala chops and implemented a small graph Api to track vertices and edges added to graph. I have basic GraphLike Trait, and have an Undirected Graph class ( UnDiGraph) and a Directed graph class (DiGraph ) that extend the GraphLike trait. Here is some of the listing

trait GraphLike[T] {
    val vertices: Map[T, VertexLike[T]]
    def addEdge( a:T, b:T ): GraphLike[T]
    def addVertex( t:T ): GraphLike[T]
    def addVertex( vert: VertexLike[T] ): GraphLike[T]
    def adjacency( t:T ): Option[ List[T] ]  =
    {
      if ( vertices contains t )
        Some( vertices(t).adjList )
      else
        None
    }
    def vertNum: Integer = vertices size
    def edgeNum: Integer = 
    {
      def summer( sum: Integer, ms: Map[T, VertexLike[T] ] ): Integer =
      {
        if ( ms.isEmpty )
          sum
        else
          summer( sum + ms.head._2.adjList.size, ms.tail )
      }
      summer( 0, vertices )
    }
    def getVertex( t: T ): VertexLike[T] =
    {
      vertices( t )
    }
    def edgeExists( a:T, b:T ): Boolean =
    {
      try
      {
          if( vertices( a ).adjList contains b )
            true
          else
            false
      }catch {
        case ex: NoSuchElementException => false 
      }
    }
}

Heres what the Directe Graph Looks like.

class DiGraph[T](val vertices: Map[ T, VertexLike[ T ] ] = Map.empty ) extends GraphLike[T] {
    def makeVertex( t:T ): VertexLike[T] = new Vertex( t )

    def addEdge( a:T, b:T ): GraphLike[T] =
    {
      //Make sure vertices exist
      if( edgeExists(a, b) )
        this
      else {
        try {
          vertices(b)
          vertices(a)
        } catch {
          case ex: NoSuchElementException => println("Vertices not Found"); this
        }
        addVertex( vertices( a ) + b )
      }
    }
    def addVertex( t:T ): DiGraph[T] = 
    {
      if( vertices contains t ) this
      else
      new DiGraph[T]( vertices + ( t -> makeVertex(t) ) )
    }
    def addVertex( vert: VertexLike[T] ): DiGraph[T] =
    {
      new DiGraph[T]( vertices + ( vert.apply -> vert ) ) 
    }
}

Vertices are stored in a Map going from type T to VertexLike[T]. Vertex Like basically holds an adjacency list for the specific Vertex. Heres what VertexLike looks like:

trait VertexLike[T] 
{
  def addEdgeTo( dest: T ): VertexLike[T]
  def adjList: List[T]
  def +( dest: T) = addEdgeTo(dest)
  def apply: T
} 

class Vertex[T](t: T, adj: List[T] = List() ) extends VertexLike[T]
{
    def apply() = t
    def adjList = adj
    def addEdgeTo( dest: T ) = 
      if( adjList contains dest )
        this
      else
        new Vertex[T]( t, dest :: adjList )
}

( Yes... i realize the apply method in the class is useless and it only works on objects. Realized that a little later ).

Anyways, I have a sample graph where I have about 80,000 vertices. Adding the vertices to the Graph is taking just way too long. I tried to do things functionally and in an immutable way. Whenever you add a vertex or an edge to a graph, you get a new graph ( I tried to make sure the constuctors of the graph types weren't doing much ). This is the client code that I use to create my graph from my data.

def GraphInstantiater: GraphLike[Int] =
{
  println( "Total number of Vertices: " + synMap.keys.size )
  def vertexAdder( ls: Iterable[Int], graph:GraphLike[Int] ): GraphLike[Int] = 
    if( ls.isEmpty) graph else vertexAdder( ls.tail, graph.addVertex( ls.head ) ) 

  val gr = vertexAdder( synMap.keys, new DiGraph[Int]( Map() ) )
  println( "Vertices added. Total: %d".format( gr.vertices.size ) )
  gr
}

I know constructing new graphs will take cycles but is it really all that great given that I'm not doing much in the constructors. Would repeatedly creating the Map of vertices keep causing problems ( its one of the parameters of the graph class ). Any ideas on what the bottlenecks are in this method would be much appreciated. Also if you need any additional information, please let me know.

Upvotes: 1

Views: 147

Answers (2)

gourlaysama
gourlaysama

Reputation: 11290

As a complement to you answer: you indeed inadvertently traverse the whole synMap.keys every time you call ls.tail.

What happens is:

  • Map.key returns the value of Map.keySet, which is a custom immutable Set.
  • that Set overrides a few things, but leaves tail and drop to their default implementation. Its tail implementation (from TraversableLike) just calls drop.
  • And that's where everything falls apart: it gets its implementation of drop from IterableLike, and that only does what you can do with an Iterable: iterate. So a new builder is created, the head of the iterator is dropped, then the iterator is added to the builder, which traverses all your keys, and a new collection (the tail) is returned.

You can probably avoid the conversion to a list altogether by using an iterator, with something like:

def vertexAdder( ls: Iterator[Int], graph:GraphLike[Int] ): GraphLike[Int] = {
  if(!ls.hasNext) 
    graph 
  else
    val h = ls.next
    vertexAdder( ls, graph.addVertex(h) ) 
}

and then:

val gr = vertexAdder( synMap.keysIterator, new DiGraph[Int]( Map() ) )

As a side note, it is a bit sad that Set doesn't provides its own version of tail. It could maybe just takes the head of its own iterator and returns itself minus that element.

Upvotes: 1

Kartik Aiyer
Kartik Aiyer

Reputation: 591

Oh wow... I figured out what was going on. In the GraphInstantiater method, the very first call which passes synMap.keys, keys returns an iterable[Int]. Looks like taking tail on this is a long process, most likely going through the whole set of keys each time.

changing the call to

val gr = vertexAdder( synMap.keys.toList, new DiGraph[Int]( Map() ) )

made everything go faster. Does anyone know what is the underlying implementation of the container returned when you call keys on a Map ?

Upvotes: 0

Related Questions