Rony Singh

Reputation: 147

Printing List giving wrong size

After receiving the employeeJobDataList data, I created

var resultList: List[List[AnytimePayEmployeeJobData]] = new ArrayList[List[AnytimePayEmployeeJobData]]

to which I add lists of data in batches of BATCH_THRESHOLD = 25 items.

Here is the code snippet:

val upsertsFailedBatches: List[FailedBatch] = new ArrayList[FailedBatch]
val upsertsEmployeeJobDataList: List[AnytimePayEmployeeJobData] = new ArrayList[AnytimePayEmployeeJobData]
var resultList: List[List[AnytimePayEmployeeJobData]] = new ArrayList[List[AnytimePayEmployeeJobData]]

for (i <- 0 until employeeJobDataList.size()) {
  upsertsEmployeeJobDataList.add(employeeJobDataList.get(i))
  if (upsertsEmployeeJobDataList.size() == BATCH_THRESHOLD) {
    resultList.add(upsertsEmployeeJobDataList)
    upsertsEmployeeJobDataList.clear()
  }
}
if (upsertsEmployeeJobDataList.size() > 0) {
  resultList.add(upsertsEmployeeJobDataList)
}

Although I am adding lists of 25 items to resultList, printing the size of each batch with the loop below shows 4 instead. Here is the output: batch size: 4

for (i <- 0 until resultList.size()) {
  println("batch size: "+ resultList.get(i).size())
}

Upvotes: 0

Views: 69

Answers (1)

Mateusz Kubuszok

Reputation: 27535

The reason this doesn't work is that you are using a mutable collection and reusing the same instance for every batch.

for (i <- 0 until employeeJobDataList.size()) {
  upsertsEmployeeJobDataList.add(employeeJobDataList.get(i))
  if (upsertsEmployeeJobDataList.size() == BATCH_THRESHOLD) {
    resultList.add(upsertsEmployeeJobDataList)
    upsertsEmployeeJobDataList.clear()
  }
}

When one batch is filled you:

  • add a reference to the batch collection (not a copy of it) to resultList
  • instead of creating a new collection for the next batch, remove all elements from that same collection and start filling it again

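As a result, resultList holds one and the same list instance, repeated once per add call (size() / BATCH_THRESHOLD times, plus once more if there is a remainder). Its content is whatever was added last: the final size() % BATCH_THRESHOLD items (none at all if the size divides evenly). That is why every entry prints the same size, 4 in your case.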
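The aliasing described above can be reproduced in isolation. The following is only a minimal sketch: the names AliasingDemo, BatchThreshold and brokenBatches are hypothetical, and it uses Integer elements with a threshold of 3 instead of the real AnytimePayEmployeeJobData type and BATCH_THRESHOLD.

```scala
import java.util.{ArrayList, List => JList}

object AliasingDemo {
  // Hypothetical small threshold so the effect is easy to see.
  val BatchThreshold = 3

  // Same shape as the loop in the question: one working list is reused.
  def brokenBatches(input: JList[Integer]): JList[JList[Integer]] = {
    val current: JList[Integer] = new ArrayList[Integer]
    val result: JList[JList[Integer]] = new ArrayList[JList[Integer]]
    for (i <- 0 until input.size()) {
      current.add(input.get(i))
      if (current.size() == BatchThreshold) {
        result.add(current) // stores a reference to `current`, not a copy
        current.clear()     // empties the list that `result` already points to
      }
    }
    if (current.size() > 0) result.add(current)
    result
  }
}
```

With 7 input items this yields three entries in the outer list, but all three are the same object and each reports a size of 1, the remainder of 7 divided by 3.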

Instead of calling clear() you should create a new collection for each batch. However, the whole loop can be replaced by a one-liner if you use Scala collections instead:

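import scala.jdk.CollectionConverters._ // Scala 2.13+, needed because employeeJobDataList is a java.util.List
employeeJobDataList.asScala.grouped(BATCH_THRESHOLD).toList // batches of at most BATCH_THRESHOLD items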
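Both fixes mentioned above might look roughly like this. It is a sketch under assumptions: BatchingSketch, batchesByCopy and batchesByGrouped are hypothetical names, the element type is left generic, and scala.jdk.CollectionConverters assumes Scala 2.13 or newer.

```scala
import java.util.{ArrayList, List => JList}
import scala.jdk.CollectionConverters._

object BatchingSketch {
  // Imperative fix: start a fresh list for every batch instead of clearing.
  def batchesByCopy[A](input: JList[A], threshold: Int): JList[JList[A]] = {
    val result: JList[JList[A]] = new ArrayList[JList[A]]
    var current: JList[A] = new ArrayList[A]
    for (i <- 0 until input.size()) {
      current.add(input.get(i))
      if (current.size() == threshold) {
        result.add(current)
        current = new ArrayList[A] // new instance, earlier batches stay intact
      }
    }
    if (!current.isEmpty) result.add(current) // trailing partial batch
    result
  }

  // Scala-collections version: convert once, then let grouped do the batching.
  def batchesByGrouped[A](input: JList[A], threshold: Int): List[List[A]] =
    input.asScala.grouped(threshold).map(_.toList).toList
}
```

For an input of 104 items and a threshold of 25, both variants produce five batches with sizes 25, 25, 25, 25 and 4. The grouped version returns immutable Scala lists, so later mutation of the input cannot change batches that were already produced.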

Upvotes: 3
