chandras
chandras

Reputation: 94

Search from a range of integers

I need to do a look up for an integer from a list of integers. I sort them and use the lower_bound to find the range the given integer falls in. This takes O(lgn). Is there any way I can do better than this?

The following are the hints to improve.

  1. Given list is always positive integers
  2. List is fixed. No insert or delete.

One way is to create an array and index in to the array . This may not be space efficient. Can I use unordered_map? what hash function should I define?

// Sort in reverse order to aid the lookup process
vector<unsigned int> sortedByRange;
//... sortedByRange.push_back(..)
sort(sortedByRange.begin(), sortedByRange.end(), greater);
Range = (sortedByAddress_.begin() - sortedByRange.end();
std::cout<<"Range :"<<Range<<std::endl;    //prints 3330203948

std::pair<unsigned int, unsigned int> lookup(unsigned int addr){
    pair<unsigned int, unsigned int> result;
    vector<unsigned int>::iterator it = lower_bound(sortedByRange.begin(), 
                                           sortedByRange.end(), addr);
    result.first = *it;
    result.second = *(it++);
    return result;
}      

Upvotes: 1

Views: 727

Answers (3)

vantaku ramesh
vantaku ramesh

Reputation: 39

Javascript

let searchRangeInterger = function(nums, target) {
  let res = [-1, -1];
  let leftSide = find(nums, target, true);
  let rightSide = find(nums, target, false);
  if (!nums.length) return res;
  if (leftSide > rightSide) return res;
  return [leftSide, rightSide];
};

let find = function (nums, target, findLeft) {
  var left = 0;
  var right = nums.length - 1;
  var mid = 0;

  while (left <= right) {
    mid = Math.floor((left + right) / 2);
    if (nums[mid] > target || (findLeft && nums[mid] === target)) {
      right = mid - 1;
    } else {
      left = mid + 1;
    }
  }

  return findLeft ? left : right;
};

Upvotes: 0

rici
rici

Reputation: 241711

If the total range is not huge, you could build a sampled index array of any convenient size (how much RAM do you want to throw at it?)

So, for example, if the total range of the data is 256M, and you have a spare megabyte, then you store the positions of every 1K interval of the data range. Then for any given data point, you do a O(1) (actually O(2) :) ) probe into the index array to find the lowest and highest plausible ranges for that data point, and then you can do lowest_bound on just that range. If your ranges are not wildly variable in size, that should give you average constant time lookup.

If you don't want to throw that much memory at the problem, you could try a pair of linear estimates based on the average range size and a fuzz factor. If that turns out not to contain a particular datapoint, you can fall back to a full binary search; otherwise, again, a binary search inside the restricted range should be average linear time.

Here's the first suggestion, in case the handwaving wasn't clear enough. Totally untested code, didn't even try compiling it, and the use of integer types is, too say the least, sloppy. If you use it, try to make it more beautiful. Also I should have (but didn't) restrict the start of the indexed range to *begin_; if that's significantly greater than 0, you should fix it.

// The provided range must be sorted, and value_type must be arithmetic.
template<type RandomIterator, unsigned long size>
class IndexedLookup {
 public:
  using value_type = typename RandomIterator::value_type;
  IndexedLookup(RandomIterator begin, RandomIterator end)
    : begin_(begin),
      end_(end),
      delta_(*(end_ - 1) / size) {
    for (unsigned long i = 0; i < size; ++i)
      index_[i] = std::lower_bound(begin_, end_, i * delta_) - begin_;
      // The above expression cannot be out of range
    index_[size] = end_ - begin_;
  }

  RandomIterator lookup(value_type needle) {
    int low = needle / delta_;
    return std::lower_bound(index_[begin_ + low],
                            index_[begin_ + low + 1],
                            needle);
  }

 private:
  RandomIterator begin_, end_;
  value_type delta_;
  std::array<int, size + 1> index_;
}    

Upvotes: 1

FooF
FooF

Reputation: 4462

Method 1: If you just need to know if a given number is in the list, and the maximum value is not too big, you could consider using bit field. The look up would be O(1) operation then.

Method 2: If the range of values is huge (small and big integers in it), but the list size is not big (e.g. couple of thousands), you could try (programmatically) crafting a hash function that

  1. is one-to-one on the values of in the list;
  2. will give a value of range 0 ... N + m with m small enough;
  3. relatively inexpensive to calculate.

The values of the constant list could then be put in an array, indexed by the hash value, for quick check of inclusion for a given input value. If there are holes in the list (m non-zero), then the holes should be indicated by special value (e.g. -1).

Inclusion test: for a given input 1. calculate hash value; 2. if the value of the hash value is out of range, the input is not in the list; 3. otherwise the input belongs to the list if and only if the value in the generated array indexed by the hash value is the same as the input value.

How to craft the hash function is worth another question in SO (for string values there exist tools to generate tools for this purpose). :-)

Limitation: If the list is not created in compile time, but calculated or received in run time of the program, then this method is not suitable. Also if this list is changing frequently, then the computational time required to generate the hash function and the code could make this approach unsuitable.

Upvotes: 0

Related Questions