Sorting tuples based on another list

Question

I am working with clustered data that is being generated by SciPy, and would love to order my data with a custom sort order.

Let's say that my data comes out looking like this:

leafIDs = [4,5,3,1,2]
rowHeaders = ['lorem','ipsum','dolor','sit','amet']

There is a one-to-one correspondence between the two lists, leafIDs and rowHeaders, and both will always be the same length. For example, the row with the header lorem has a leaf ID of 4, ipsum has an ID of 5, and so on. Note that the numeric order of leafIDs is not the order I want to sort by (otherwise I could use the tried and tested method). The intended one-to-one correspondence can be visualised as follows:

+---------+------------+
| leafIDs | rowHeaders |
+---------+------------+
|       4 | lorem      |
|       5 | ipsum      |
|       3 | dolor      |
|       1 | sit        |
|       2 | amet       |
+---------+------------+

Now I would like to sort these two lists by a custom order, given as a third list that will always be the same length as the two lists above. You can see it as a scrambled order of rowHeaders:

rowHeaders_custom = ['amet','lorem','sit','ipsum','dolor']

The desired outcome is leafIDs sorted according to rowHeaders_custom, using its one-to-one relationship with rowHeaders, i.e.:

# Desired outcome
leafIDs_custom = [2,4,1,5,3]

What I've tried so far: my current approach is as follows:

1. Zip leafIDs and rowHeaders, i.e. zippedRows = zip(leafIDs, rowHeaders).
2. Attempt to sort the list of tuples by the list rowHeaders_custom.

However, I am hitting a roadblock on the second step. It would be nice if there were any suggestions on how to perform this custom-ordered sort. I understand I might be running into an XY problem by attempting to order a list of tuples with another list, but my understanding of sort() is rather limited.

Linus Thiel · Accepted Answer

What if you make a dict out of the zippedRows? I.e.

>>> dict(zip(rowHeaders, leafIDs))
{'ipsum': 5, 'sit': 1, 'lorem': 4, 'amet': 2, 'dolor': 3}

Capturing that, then:

dictRows = dict(zip(rowHeaders, leafIDs))

You could just pull the values out of that:

leafIDs_custom = [dictRows[v] for v in rowHeaders_custom]
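
Put together, a minimal self-contained sketch using the example lists from the question might look like this. It assumes every header in rowHeaders is unique and that every entry of rowHeaders_custom also appears in rowHeaders; otherwise the dict would silently drop duplicates or the lookup would raise a KeyError.

```python
leafIDs = [4, 5, 3, 1, 2]
rowHeaders = ['lorem', 'ipsum', 'dolor', 'sit', 'amet']
rowHeaders_custom = ['amet', 'lorem', 'sit', 'ipsum', 'dolor']

# Map each header to its leaf ID (assumes headers are unique).
dictRows = dict(zip(rowHeaders, leafIDs))

# Look up the IDs in the custom header order.
leafIDs_custom = [dictRows[v] for v in rowHeaders_custom]

print(leafIDs_custom)  # [2, 4, 1, 5, 3]
```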

I don't know, there might be a more pythonic way to do it, but that's the solution I'm coming up with.
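
If you would rather keep the zipped tuples and actually sort them, as in the second step of the question, one possible sketch is to give sorted() a key that looks up each header's position in rowHeaders_custom. The helper name order is only illustrative, and the same assumption applies: headers are unique and all of them appear in rowHeaders_custom.

```python
leafIDs = [4, 5, 3, 1, 2]
rowHeaders = ['lorem', 'ipsum', 'dolor', 'sit', 'amet']
rowHeaders_custom = ['amet', 'lorem', 'sit', 'ipsum', 'dolor']

# Position of each header in the custom order ("order" is an illustrative name).
order = {header: i for i, header in enumerate(rowHeaders_custom)}

# Sort the (leafID, header) tuples by where their header sits in the custom order.
zippedRows = sorted(zip(leafIDs, rowHeaders), key=lambda row: order[row[1]])

# Pull the leaf IDs back out of the sorted tuples.
leafIDs_custom = [leaf for leaf, _ in zippedRows]

print(zippedRows)
print(leafIDs_custom)  # [2, 4, 1, 5, 3]
```

Building the order dict first keeps each key lookup constant-time; calling rowHeaders_custom.index() inside the key function would also work but rescans the list for every row.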
