Reputation:
Problem Statement:
I was asked this interview question recently. I was only able to come up with the code below, which runs in O(k log n):
Given k <= n sorted arrays each of size n, there exists a data structure requiring O(kn) preprocessing time and memory that answers iterated search queries in O(k + log n) time.
I have k sorted Lists, each of size n. Currently I have hard-coded 5 sorted Lists, each of size 3, but in general those numbers can be very large.
I would like to search for a single element in each of the k Lists.
Obviously, I can binary search each array individually, which results in O(k log n), where k is the number of sorted arrays.
Can we do it in O(k + log n), where k is the number of sorted arrays? I think there might be a better way, since right now we are repeating essentially the same search k times:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SearchItem {
    private List<List<Integer>> dataInput;

    public SearchItem(final List<List<Integer>> inputs) {
        dataInput = new ArrayList<List<Integer>>();
        for (List<Integer> input : inputs) {
            dataInput.add(new ArrayList<Integer>(input));
        }
    }

    // For each list, returns the smallest element >= x (or null if none).
    public List<Integer> getItem(final Integer x) {
        List<Integer> outputs = new ArrayList<Integer>();
        for (List<Integer> data : dataInput) {
            int i = Collections.binarySearch(data, x); // binary searching the item
            if (i < 0)
                i = -(i + 1); // convert to insertion point
            outputs.add(i == data.size() ? null : data.get(i));
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = new ArrayList<List<Integer>>();
        List<Integer> list1 = new ArrayList<Integer>(Arrays.asList(3, 4, 6));
        List<Integer> list2 = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        List<Integer> list3 = new ArrayList<Integer>(Arrays.asList(2, 3, 6));
        List<Integer> list4 = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        List<Integer> list5 = new ArrayList<Integer>(Arrays.asList(4, 8, 13));
        lists.add(list1);
        lists.add(list2);
        lists.add(list3);
        lists.add(list4);
        lists.add(list5);
        SearchItem search = new SearchItem(lists);
        System.out.println(lists);
        List<Integer> dataOutput = search.getItem(5);
        System.out.println(dataOutput);
    }
}
Whatever output I am seeing with my approach above should come from the new approach as well, but it should work in O(k + log n).
Is this possible to achieve? Can anyone provide an example of how this would work, based on my example?
Upvotes: 5
Views: 761
Reputation: 95308
The technique is called fractional cascading, which sounds very cool. What you do is the following: going through the lists in order, you merge every second element of the (already merged version of the) previous list into the current one. The resulting interleaving will look something like this:
(Source: "You could have invented fractional cascading" by Edward Z. Yang)
and every list element will have a couple of pointers, to find predecessors/successors of a certain kind fast and to find the corresponding position in list i - 1.
It turns out the total number of list elements is only increased by a constant factor, but the cool thing is that you can now do queries fast:
1. Binary search for the query value in the merged list k: O(log n)
2. You have now also found the element in the original list k, because you can find in O(1) the surrounding elements that were originally in list k: O(1)
3. Because you have pointers to the successor/predecessor in list k - 1, you can report the result for all the other lists in O(1) each.
Total runtime: O(log n + k)
For more information, you should definitely read the blog post, it has lots of visualizing illustrations and additional explanations.
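For a concrete picture of how this plays out on the question's example lists, here is a self-contained sketch (class and field names are my own invention, not from the blog post). It runs the indices in the opposite direction from the description above, which is symmetric: it cascades elements of later lists up into earlier ones, binary searches the first merged list once, and then follows O(1) pointers downward. It answers the same "smallest element >= x per list" query as the getItem in the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FractionalCascading {
    private final List<List<Integer>> lists; // the k original sorted lists
    private final int[][] merged;   // merged[i] = list i interleaved with every
                                    // second element of merged[i + 1]
    private final int[][] ownSucc;  // ownSucc[i][p] = index in lists[i] of the first
                                    // "own" element at merged position >= p
    private final int[][] down;     // down[i][p] = index in merged[i + 1] of the first
                                    // copied entry at merged position >= p

    public FractionalCascading(List<List<Integer>> lists) {
        this.lists = lists;
        int k = lists.size();
        merged = new int[k][];
        ownSucc = new int[k][];
        down = new int[k][];
        for (int i = k - 1; i >= 0; i--) {
            List<Integer> own = lists.get(i);
            int[] next = (i == k - 1) ? new int[0] : merged[i + 1];
            int nCopied = (next.length + 1) / 2; // copy entries 0, 2, 4, ...
            int size = own.size() + nCopied;
            int[] m = new int[size];
            boolean[] isOwn = new boolean[size];
            int[] origIdx = new int[size];
            int a = 0, b = 0;
            for (int p = 0; p < size; p++) { // ordinary two-way merge
                if (b >= nCopied || (a < own.size() && own.get(a) <= next[2 * b])) {
                    m[p] = own.get(a); isOwn[p] = true; a++;
                } else {
                    m[p] = next[2 * b]; origIdx[p] = 2 * b; b++;
                }
            }
            merged[i] = m;
            int[] os = new int[size + 1]; // count of own elements before each position
            int cnt = 0;
            for (int p = 0; p < size; p++) { os[p] = cnt; if (isOwn[p]) cnt++; }
            os[size] = cnt;
            ownSucc[i] = os;
            int[] dn = new int[size + 1]; // nearest copied entry at or after p
            int d = next.length;
            dn[size] = d;
            for (int p = size - 1; p >= 0; p--) {
                if (!isOwn[p]) d = origIdx[p];
                dn[p] = d;
            }
            down[i] = dn;
        }
    }

    // For each list, the smallest element >= x (null if none) -- the same output
    // as the question's getItem, but with a single binary search in total.
    public List<Integer> getItem(int x) {
        List<Integer> result = new ArrayList<Integer>();
        int p = lowerBound(merged[0], x); // the only O(log n) step
        for (int i = 0; i < lists.size(); i++) {
            int oi = ownSucc[i][p];
            result.add(oi < lists.get(i).size() ? lists.get(i).get(oi) : null);
            if (i < lists.size() - 1) {
                int j = down[i][p]; // even index into merged[i+1], or its length
                // the true successor position in merged[i+1] is j or j - 1: O(1)
                p = (j - 1 >= 0 && merged[i + 1][j - 1] >= x) ? j - 1 : j;
            }
        }
        return result;
    }

    private static int lowerBound(int[] a, int x) {
        int lo = 0, hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < x) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        FractionalCascading fc = new FractionalCascading(Arrays.asList(
                Arrays.asList(3, 4, 6), Arrays.asList(1, 2, 3),
                Arrays.asList(2, 3, 6), Arrays.asList(1, 2, 3),
                Arrays.asList(4, 8, 13)));
        System.out.println(fc.getItem(5)); // [6, null, 6, null, 8]
    }
}
```

On the question's five lists, getItem(5) produces [6, null, 6, null, 8], matching the output of the O(k log n) version.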
Upvotes: 4
Reputation: 1427
Someone else has probably answered this by now (I haven't refreshed the page), but here is a method for merging the lists. With this simple linear scan for the minimum it runs in O(k²n) time, since each of the kn output elements requires a pass over the k lists (a min-heap would bring that down to O(kn log k)). I haven't actually tested it in an editor, but the idea should work.
After calling this method you should just be able to do a binary search on the merged list.
public static List<Integer> mergeSortedLists(List<List<Integer>> sortedLists) {
    List<Integer> mergedList = new ArrayList<Integer>();
    int[] listIndexes = new int[sortedLists.size()]; // per-list cursors, all start at 0
    int completedLists = 0;
    while (completedLists < sortedLists.size()) {
        int lowestValue = 0;
        int lowestList = -1; // index of the list holding the lowest remaining value
        for (int i = 0; i < sortedLists.size(); i++) {
            int currentIndex = listIndexes[i];
            List<Integer> currentList = sortedLists.get(i);
            if (currentIndex >= currentList.size()) continue; // already finished merging this list, skip
            int currentValue = currentList.get(currentIndex);
            if (lowestList == -1 || currentValue < lowestValue) {
                lowestValue = currentValue;
                lowestList = i;
            }
        }
        // put the lowest found value into mergedList and increment that list's index
        mergedList.add(lowestValue);
        listIndexes[lowestList]++;
        // if the incremented index reaches the end of its list, mark the list
        // complete; when all lists are complete the while loop is broken out of
        // and the merge is done
        if (listIndexes[lowestList] == sortedLists.get(lowestList).size()) {
            completedLists++;
        }
    }
    return mergedList;
}
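To show the merge-then-search idea end to end, here is a self-contained variant (my own illustration, not the answerer's code) that does the k-way merge with a PriorityQueue in O(kn log k) and then binary searches the result:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class MergeThenSearch {
    // k-way merge using a min-heap of {value, listIndex, elementIndex} triples.
    static List<Integer> merge(List<List<Integer>> lists) {
        List<Integer> merged = new ArrayList<Integer>();
        PriorityQueue<int[]> heap = new PriorityQueue<int[]>((p, q) -> p[0] - q[0]);
        for (int i = 0; i < lists.size(); i++)
            if (!lists.get(i).isEmpty())
                heap.add(new int[] { lists.get(i).get(0), i, 0 });
        while (!heap.isEmpty()) {
            int[] top = heap.poll(); // smallest remaining element across all lists
            merged.add(top[0]);
            List<Integer> src = lists.get(top[1]);
            if (top[2] + 1 < src.size()) // advance the cursor of the source list
                heap.add(new int[] { src.get(top[2] + 1), top[1], top[2] + 1 });
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = Arrays.asList(
                Arrays.asList(3, 4, 6), Arrays.asList(1, 2, 3), Arrays.asList(4, 8, 13));
        List<Integer> merged = merge(lists);
        System.out.println(merged); // [1, 2, 3, 3, 4, 4, 6, 8, 13]
        System.out.println(Collections.binarySearch(merged, 8) >= 0); // true
    }
}
```

Note that a binary search on the merged list answers "is x present in any list"; to reproduce the per-list output of the question's getItem, the list of origin of each element would still have to be tracked separately.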
Upvotes: 0
Reputation: 3119
Since your arrays are sorted, the elements are comparable. Use a B-tree-like arrangement, and make sure the arrays don't have overlapping ranges, i.e. each array is sorted and every item inside it satisfies either:
item < the first element of all the other arrays; or item > the last element of all the other arrays.
Then O(k + log n) is achieved by comparing the search item against each array's bounds until first <= search item <= last, and then doing an O(log n) search inside that array.
But essentially this can be O(log k + log n), since the right array can itself be found with a binary search over the k arrays.
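Under that non-overlap assumption, the two-phase search can be sketched like this (my own illustration; note that the example lists in the question do overlap, so they would not satisfy this precondition):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class DisjointRangeSearch {
    // Assumes the arrays' value ranges are disjoint and the arrays are
    // ordered by range, e.g. [1,2,3], [4,5,6], [7,8,9].
    static int indexOf(List<List<Integer>> arrays, int x) {
        // Phase 1, O(log k): binary search over the arrays by first element
        // to find the last array whose first element is <= x.
        int lo = 0, hi = arrays.size() - 1, candidate = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (arrays.get(mid).get(0) <= x) { candidate = mid; lo = mid + 1; }
            else hi = mid - 1;
        }
        // Phase 2, O(log n): ordinary binary search inside that one array.
        return Collections.binarySearch(arrays.get(candidate), x); // < 0 if absent
    }

    public static void main(String[] args) {
        List<List<Integer>> arrays = Arrays.asList(
                Arrays.asList(1, 2, 3), Arrays.asList(4, 5, 6), Arrays.asList(7, 8, 9));
        System.out.println(indexOf(arrays, 5)); // 1: found at index 1 of the second array
    }
}
```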
Upvotes: 0