Cherry
Cherry

Reputation: 33544

Why scala futures do not work faster even with more threads in threads pool?

I have a following algorithm with scala:

  1. Do initial call to db to initialize cursor
  2. Get 1000 entities from db (Returns Future)
  3. For every entity process one additional request to database and get modified entity (returns future)
  4. Transform original entity
  5. Put transformed entity to Future call back from #3
  6. Wait for all Futures

In scala it some thing like:

 val client = ...
 val size = 1000
 val init:Future = client.firstSearch(size) //request over network
 val initResult = Await(init, 30.seconds)
 var cursorId:String = initResult.getCursorId
 while (!cursorId.isEmpty) {
    val futures:Seq[Future] = client.grabWithSize(cursorId).map{response=>
        response.getAllResults.map(result=>
            val grabbedOne:Future[Entity] = client.grabOneEntity(result.id) //request over network
            val resultMap:Map[String,Any] = buildMap(result)
            val transformed:Map[String,Any] = transform(resultMap) //no future here
            grabbedOne.map{grabbedOne=>
                buildMap(grabbedOne) == transformed
            }
        }
    Futures.sequence(futures).map(_=> response.getNewCursorId)
    }
 }

 def buildMap(...):Map[String,Any] //sync call

I noticed that if I increase size say two times, every iteration in while started working slowly ~1.5. But I do not see that my PC processor loaded more. It loaded near zero, but time increases in ~1.5. Why? I have setuped:

implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1024))

I think, that not all Futures executed in parallel. But why? And ho to fix?

Upvotes: 1

Views: 173

Answers (1)

bjfletcher
bjfletcher

Reputation: 11508

I see that in your code, the Futures don't block each other. It's more likely the database that is the bottleneck.

Is it possible to do a SQL join for O(1) rather than O(n) in terms of database calls? (If you're using Slick, have a look under the queries section about joins.)

If the load is low, it's probably that the connection pool is maxed out, you'd need to increase it for the database and the network.

Upvotes: 1

Related Questions