Reputation: 1389
I have a list of integers and I need to find out the range it falls in. I have a list of ranges which might be of size 2 to 15 at the maximum. Currently for every integer,I check through the list of ranges and find its location. But this takes a lot of time as the list of integers I needed to check includes few thousands.
//list of integers
val numList : List[(Int,Int)] = List((1,4),(6,20),(8,15),(9,15),(23,27),(21,25))
//list of ranges
val rangesList:List[(Int,Int)] = List((1,5),(5,10),(15,30))
def checkRegions(numPos:(Int,Int),posList:List[(Int,Int)]){
val loop = new Breaks()
loop.breakable {
for (va <- 0 until posList.length) {
if (numPos._1 >= posList(va)._1 && numPos._2 <= (posList(va)._2)) {
//i save "va"
loop.break()
}
}
}
}
Currently for every integer in numList
I go through the rangesList
to find its range and save its range location. Is there any faster/better way approach to this issue?
Update: It's actually a list of tuples that is compared against a list of ranges.
Upvotes: 1
Views: 597
Reputation: 20415
One approach includes the use of parallel collections with par
, and also indexWhere
which delivers the index of the first item in a collection that holds a condition.
For readability consider this predicate for checking interval inclusion,
def isIn( n: (Int,Int), r: (Int,Int) ) = (r._1 <= n._1 && n._2 <= r._2)
Thus,
val indexes = numList.par.map {n => rangesList.indexWhere(r => isIn(n,r))}
indexes: ParVector(0, -1, -1, -1, 2, 2)
delivers, for each number, the index in the ranges collection where it is included. Value -1
indicates the condition did not hold.
For associating numbers with range indexes, consider this,
numList zip indexes
res: List(((1,4), 0), ((6,20),-1), ((8,15),-1),
((9,15),-1), ((23,27),2), ((21,25),2))
Parallel collections may prove more efficient that the non parallel counterpart for performing computations on a very large number of items.
Upvotes: 2
Reputation: 5768
First of all, using apply
on a List
is problematic, since it takes linear run time.
List(1,2,3)(2)
has to traverse the whole list to finally get the last element at index 2.
If you want your code to be efficient, you should either find a way around it or choose another data structure. Data structures like IndexedSeq
have constant time indexing.
You should also avoid breaks
, as to my knowledge, it works via exceptions and that is not a good practice. There are always ways around it.
You can do something like this:
val numList : List[(Int,Int)] = List((1,4),(6,20),(8,15),(9,15),(23,27),(21,25))
val rangeList:List[(Int,Int)] = List((1,5),(5,10),(15,30))
def getRegions(numList: List[(Int,Int)], rangeList:List[(Int,Int)]) = {
val indexedRangeList = rangeList.zipWithIndex
numList.map{case (a,b) => indexedRangeList
.find{case ((min, max), index) => a >= min && b <= max}.fold(-1)(_._2)}
}
And use it like this:
getRegions(numList, rangeList)
//yields List(0, -1, -1, -1, 2, 2)
I chose to yield -1 when no range matches. The key thing is that you zip the ranges with the index beforehand. Therefore we know at each range, what index this range has and are never using apply
.
If you use this method to get the indices to again access the ranges in rangeList via apply
, you should consider changing to IndexedSeq
.
The apply
will of course only be costly when the number ranges gets big. If, as you mentioned, it is only 2-15, then it is no problem. I just want to give you the general idea.
Upvotes: 2