Sean Holdsworth

Reputation: 434

Mapping sort indexes

I encountered and solved this problem as part of a larger algorithm, but my solution seems inelegant and I would appreciate any insights.

I have a list of pairs which can be viewed as points on a Cartesian plane. I need to generate three lists: the sorted x values, the sorted y values, and a list which maps each index in the sorted x values to the index in the sorted y values of the y value with which that x was originally paired.

A concrete example might help explain. Given the following list of points:

((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))

The sorted list of x values would be (3, 4, 5, 7, 9, 15), and the sorted list of y values would be (0, 4, 7, 7, 11, 12).

Assuming a zero-based indexing scheme, the list that maps each x list index to the y list index of its paired y value would be (2, 3, 0, 4, 5, 1).

For example, the value 7 appears at index 3 in the x list. The value in the mapping list at index 3 is 4, and the value at index 4 in the y list is 11, corresponding to the original pairing (7, 11).
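To make the required property concrete, here is a minimal Python 3 sketch that checks a candidate mapping against the example data. The helper name `is_valid_mapping` is hypothetical and used only for illustration. Note that because the y value 7 appears twice, indices 2 and 3 in the mapping could be swapped and the result would still be valid.

```python
# The example data from the question.
points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))
xs = (3, 4, 5, 7, 9, 15)
ys = (0, 4, 7, 7, 11, 12)
index_list = (2, 3, 0, 4, 5, 1)

def is_valid_mapping(points, xs, ys, index_list):
    """Return True if following index_list from xs into ys recovers the original pairs.

    Hypothetical helper, not part of any library.
    """
    # Pair each sorted x with the y value that the mapping points at.
    recovered = [(x, ys[j]) for x, j in zip(xs, index_list)]
    # The mapping must be a permutation of 0..N-1, and the recovered
    # pairs must be exactly the original points (in any order).
    return (sorted(index_list) == list(range(len(points)))
            and sorted(recovered) == sorted(points))

print(is_valid_mapping(points, xs, ys, index_list))
```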

What is the simplest way of generating this mapping list?

Upvotes: 1

Views: 1089

Answers (4)

Sean Holdsworth

Reputation: 434

I've just understood what j_random_hacker meant by removing a level of indirection by sorting the points by x first. That allows things to be tidied up nicely. Thanks.

points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))

N = len(points)

ordered_by_x = sorted(points)
ordered_by_y = sorted(zip([y for (x, y) in ordered_by_x], range(N)))

index_list = N * [0]

for i, (y, k) in enumerate(ordered_by_y):
    index_list[k] = i

xs = [x for (x, y) in ordered_by_x]
ys = [y for (y, k) in ordered_by_y]

print("xs:", xs)
print("ys:", ys)
print("index_list:", index_list)

Upvotes: 1

Sean Holdsworth

Reputation: 434

Thank you for the answers. For what it's worth, the solution I had was pretty similar to those outlined, but as j_random_hacker pointed out, there's no need for a map. It just struck me that this little problem seems more complicated than it appears at first glance and I was wondering if I was missing something obvious. I've rehashed my solution into Python for comparison.

points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))

N = len(points)

# Separate the points into their x and y components, tag the values with
# their index into the points list.

# Sort both resulting (value, tag) lists and then unzip them into lists of
# sorted x and y values and the tag information.

xs, s = zip(*sorted(zip([x for (x, y) in points], range(N))))
ys, r = zip(*sorted(zip([y for (x, y) in points], range(N))))

# Generate the mapping list.

t = N * [0]

for i in range(N):
    t[r[i]] = i

index_list = [t[j] for j in s]

print("xs:", xs)
print("ys:", ys)
print("index_list:", index_list)

Output:

xs: (3, 4, 5, 7, 9, 15)
ys: (0, 4, 7, 7, 11, 12)
index_list: [2, 3, 0, 4, 5, 1]

Upvotes: 1

Kevin

Reputation: 76204

I propose the following. Generate the unsorted x and y lists.

xs = [3, 15,  7, 5, 4, 9 ]
ys = [7,  4, 11, 0, 7, 12]

Transform each element into a tuple - the first of the pair being the coordinate, the second being the original index.

xs = [(3, 0), (15, 1), ( 7, 2), (5, 3), (4, 4), ( 9, 5)]
ys = [(7, 0), ( 4, 1), (11, 2), (0, 3), (7, 4), (12, 5)]

Sort both lists.

xs = [(3, 0), (4, 4), (5, 3), (7, 2), ( 9, 5), (15, 1)]
ys = [(0, 3), (4, 1), (7, 0), (7, 4), (11, 2), (12, 5)]

Create an array, y_positions. The nth element of the array contains the current index of the y element that was originally at index n.

Create an empty index_list. For each element of xs, get the original_index, the second element of the tuple. Use y_positions to retrieve the current index of the y element with the given original_index. Add the current index to index_list.

Finally, remove the index values from xs and ys.

Here's a sample Python implementation.

points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))

# Generate the unsorted lists.
xs, ys = zip(*points)

# Pair each element with its original index, then sort by value.
xs = sorted(zip(xs, range(len(xs))))
ys = sorted(zip(ys, range(len(ys))))

# Generate the y positions list: y_positions[n] is the sorted index of the
# y value that was originally at index n.
y_positions = [None] * len(ys)
for i, (y, original_index) in enumerate(ys):
    y_positions[original_index] = i

# Generate `index_list`.
index_list = [y_positions[original_index] for x, original_index in xs]

# Remove the index tags from the x and y lists.
xs = tuple(x for x, original_index in xs)
ys = tuple(y for y, original_index in ys)

print("xs:", xs)
print("ys:", ys)
print("index list:", index_list)

Output:

xs: (3, 4, 5, 7, 9, 15)
ys: (0, 4, 7, 7, 11, 12)
index list: [2, 3, 0, 4, 5, 1]

Generation of y_positions and index_list is O(n) time, so the complexity of the algorithm as a whole is dominated by the O(n log n) sorting step.

Upvotes: 1

j_random_hacker

Reputation: 51226

Here's a simple O(n log n) method:

  1. Sort the pairs by their x value: ((3, 7), (4, 7), (5, 0), (7, 11), (9, 12), (15, 4))
  2. Produce a list of pairs in which the first component is the y value from the same position in the previous list and the second increases from 0: ((7, 0), (7, 1), (0, 2), (11, 3), (12, 4), (4, 5))
  3. Sort this list by its first component (y value): ((0, 2), (4, 5), (7, 0), (7, 1), (11, 3), (12, 4))
  4. Iterate through this list. For the ith such pair (y, k), set yFor[k] = i. yFor[] is your list (well, array) mapping indices in the sorted x list to indices in the sorted y list.
  5. Create the sorted x list simply by removing the 2nd element from the list produced in step 1.
  6. Create the sorted y list by doing the same with the list produced in step 3.
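The steps above can be sketched in Python 3 roughly as follows. This is only an illustration: the function name `sort_index_mapping` and the variable names are assumptions, and `sorted(points)` breaks ties in x by comparing y, which goes slightly beyond what step 1 requires.

```python
def sort_index_mapping(points):
    """Hypothetical helper: return (xs, ys, y_for) following steps 1 to 6."""
    # Step 1: sort the pairs by x (tuples compare lexicographically,
    # so equal x values are ordered by y).
    by_x = sorted(points)
    # Steps 2 and 3: tag each y with its position in the x-sorted list,
    # then sort the (y, k) pairs by y.
    by_y = sorted((y, k) for k, (_, y) in enumerate(by_x))
    # Step 4: the i-th pair (y, k) means x position k maps to y position i.
    y_for = [0] * len(by_x)
    for i, (_, k) in enumerate(by_y):
        y_for[k] = i
    # Steps 5 and 6: strip the second component from each list.
    xs = [x for x, _ in by_x]
    ys = [y for y, _ in by_y]
    return xs, ys, y_for

points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))
xs, ys, y_for = sort_index_mapping(points)
print(xs)     # [3, 4, 5, 7, 9, 15]
print(ys)     # [0, 4, 7, 7, 11, 12]
print(y_for)  # [2, 3, 0, 4, 5, 1]
```

Two sorts plus linear passes keep the whole thing at O(n log n), and no dictionary or search is needed because the tag k already is the index into the sorted x list.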

Upvotes: 3
