Reputation: 66103
I am working with clustered data that is being generated by SciPy, and would love to order my data with a custom sort order.
Let's say that my data comes out looking like this:
leafIDs = [4,5,3,1,2]
rowHeaders = ['lorem','ipsum','dolor','sit','amet']
There is a one-to-one correspondence between the two lists, leafIDs and rowHeaders. Both will always be the same length. For example, the row with the header lorem will have a leaf ID of 4, ipsum will have an ID of 5, and so on. Note that the leafIDs are not the order I want to sort by (otherwise I could use the tried and tested method). The intended one-to-one correspondence can be visualised as follows:
+---------+------------+
| leafIDs | rowHeaders |
+---------+------------+
| 4 | lorem |
| 5 | ipsum |
| 3 | dolor |
| 1 | sit |
| 2 | amet |
+---------+------------+
Now I would like to sort these two lists by a custom order, which again will always be the same length as both aforementioned lists. You can see it as a scrambled order of rowHeaders:
rowHeaders_custom = ['amet','lorem','sit','ipsum','dolor']
The desired outcome is leafIDs sorted based on rowHeaders_custom and its one-to-one relationship with rowHeaders, i.e.:
# Desired outcome
leafIDs_custom = [2,4,1,5,3]
What I've tried so far: my approach currently is as follows:
First, zip leafIDs and rowHeaders, i.e. zippedRows = zip(leafIDs, rowHeaders).
Second, sort zippedRows by rowHeaders_custom.
However, I am hitting a roadblock on the second step. It would be nice if there were any suggestions on how to perform this custom ordered sort. I understand I might be hitting an XY problem by attempting to order a list of tuples with another list, but my understanding of sort() is rather limited.
Upvotes: 0
Views: 326
Reputation: 11580
I presume you have several rows to rearrange, not just one.
Here is a solution that performs the translation of the columns only once, without building a mapping for every row (tuple) to be sorted. After all, the destinations remain the same.
It records the original position of each header and then builds the rearranged tuples by picking from those positions:
leaf_lst = [(4,5,3,1,2), (1,2,3,4,5), (6,7,8,9,0)]
rowHeaders = ['lorem','ipsum','dolor','sit','amet']
rowHeaders_custom = ['amet','lorem','sit','ipsum','dolor']
old_pos = tuple(rowHeaders.index(h) for h in rowHeaders_custom)
leaf_lst_custom = [tuple(t[p] for p in old_pos) for t in leaf_lst]
print(leaf_lst_custom)
produces
[(2, 4, 1, 5, 3), (5, 1, 4, 2, 3), (0, 6, 9, 7, 8)]
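A possible variation, in case the header list is long: rowHeaders.index(h) scans the list once per header, so precomputing a header-to-position dict may be cheaper. This is only a sketch. The helper name reorder_rows is hypothetical (it is not part of the code above), and it assumes every header is unique and appears in both lists.

```python
def reorder_rows(rows, headers, custom_headers):
    """Rearrange every row from the order of `headers` to `custom_headers`.

    Hypothetical helper for illustration; assumes headers are unique.
    """
    # Map each header to its original position in a single pass.
    position = {h: i for i, h in enumerate(headers)}
    # Original positions to pick from, listed in the custom order.
    old_pos = [position[h] for h in custom_headers]
    return [tuple(row[p] for p in old_pos) for row in rows]


leaf_lst = [(4, 5, 3, 1, 2), (1, 2, 3, 4, 5), (6, 7, 8, 9, 0)]
rowHeaders = ['lorem', 'ipsum', 'dolor', 'sit', 'amet']
rowHeaders_custom = ['amet', 'lorem', 'sit', 'ipsum', 'dolor']

leaf_lst_custom = reorder_rows(leaf_lst, rowHeaders, rowHeaders_custom)
print(leaf_lst_custom)
```

With the sample data this should give the same three tuples as the version above.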
Upvotes: 2
Reputation: 39223
What if you make a dict out of the zippedRows? I.e.
>>> dict(zip(rowHeaders, leafIDs))
{'ipsum': 5, 'sit': 1, 'lorem': 4, 'amet': 2, 'dolor': 3}
Capturing that, then:
dictRows = dict(zip(rowHeaders, leafIDs))
You could just pull the values out of that:
leafIDs_custom = [dictRows[v] for v in rowHeaders_custom]
I don't know, there might be a more pythonic way to do it, but that's the solution I'm coming up with.
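If you would rather keep the zip-then-sort plan from the question, a minimal sketch could look like the following. It assumes every header appears exactly once in both lists, and rank is a name introduced here purely for illustration.

```python
leafIDs = [4, 5, 3, 1, 2]
rowHeaders = ['lorem', 'ipsum', 'dolor', 'sit', 'amet']
rowHeaders_custom = ['amet', 'lorem', 'sit', 'ipsum', 'dolor']

# Rank of each header in the custom order (illustrative name).
rank = {header: i for i, header in enumerate(rowHeaders_custom)}

# Sort the (leafID, header) pairs by the rank of their header.
zippedRows = sorted(zip(leafIDs, rowHeaders), key=lambda pair: rank[pair[1]])

# Keep only the leaf IDs, now in the custom order.
leafIDs_custom = [leaf_id for leaf_id, _ in zippedRows]
print(leafIDs_custom)  # [2, 4, 1, 5, 3]
```

The sort costs O(n log n) where the plain dict lookup is O(n), so this is mainly useful if you also want the sorted pairs themselves.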
Upvotes: 4