Reputation: 6860
I am looking at the following interview question :
Given 2d coordinates , find the k points which are closest to the origin. Propose a data structure for storing the points and the method to get the k points. Also point out the complexity of the code.
The solution that I have figured out is to save the 2d points in an array. For the first k points, find the distance of each point from the origin and build a max heap. For the remaining points , calculate distance from the origin , say dist. If dist is greater than the topmost element of the heap, then change topmost element of heap to dist and run the heapify()
procedure.
This would take O(k)
to build the heap and O((n-k)log k)
for heapify()
procedure , thus the total complexity = O(n log k)
.
Can anyone suggest a better data structure and/or method , with a possibly better efficiency too ?
EDIT
Would some other data structure be beneficial here ?
Upvotes: 4
Views: 4198
Reputation: 432
I write a simple version for you using so-called 'partial sorting' http://tzutalin.blogspot.sg/2017/02/interview-type-questions-minqueue.html
public static void main(String[] args) {
Point[] points = new Point[7];
points[0] = new Point(0, 0);
points[1] = new Point(1, 7);
points[2] = new Point(2, 2);
points[3] = new Point(2, 2);
points[4] = new Point(3, 2);
points[5] = new Point(1, 4);
points[6] = new Point(1, 1);
int k = 3;
qSelect(points, k - 1);
for (int i = 0; i < k; i++) {
System.out.println("" + points[i].x + "," + points[i].y);
}
// Output will be
// 0,0
// 1,1
// 2,2
}
// in-place qselect and zero-based
static void qSelect(Point[] points, int k) {
int l = 0;
int h = points.length - 1;
while (l <= h) {
int partionInd = partition(l, h, points);
if (partionInd == k) {
return;
} else if (partionInd < k) {
l = partionInd + 1;
} else {
h = partionInd - 1;
}
}
}
static int partition(int l, int h, Point[] points) {
// Random can be better
// int p = l + new Random.nextInt(h - l + 1);
int p = l + (h - l) / 2;
int ind = l;
swap(p, h, points);
Point comparePoint = points[h];
for (int i = l; i < h; i++) {
if (points[i].getDistFromCenter() < comparePoint.getDistFromCenter()) {
swap(i, ind, points);
ind++;
}
}
swap(ind, h, points);
return ind;
}
static void swap(int i, int j, Point[] points) {
Point temp = points[i];
points[i] = points[j];
points[j] = temp;
}
Upvotes: 1
Reputation: 41137
What you're looking for is partial sorting.
I think the best way is to put everything into an unsorted array and then do use a modified in-place quicksort which ignores partitions whose indices are entirely above or entirely below k
, and use distance from origin as your comparison.
Pseudocode from the wikipedia article above:
function quickfindFirstK(list, left, right, k)
if right > left
select pivotIndex between left and right
pivotNewIndex := partition(list, left, right, pivotIndex)
if pivotNewIndex > left + k // new condition
quickfindFirstK(list, left, pivotNewIndex-1, k)
if pivotNewIndex < left + k
quickfindFirstK(list, pivotNewIndex+1, right, k+left-pivotNewIndex-1)
After execution, this will leave the smallest k
items in the first k
positions, but not in order.
Upvotes: 3
Reputation: 3414
I would use order statistics for this one.
Note that we use a modified SELECT
that uses distance from the origin as the comparison function.
A
, first element is A[1]
and last is A[n]
.SELECT(A,1,n,k)
to find the k
th closest element to the origin.A[1..k]
.One of the benefits of SELECT
that it partitions the input,
so that the smallest k-1
elements are left to A[k]
.
So storing the elements in an array is O(n)
.
Running SELECT
is O(n)
.
Returning the requested elements is O(1)
.
Upvotes: 1