Reputation: 314
I have a 2D list:
ls = [
['-2,60233106656288100', '2', 'C'],
['-9,60233106656288100', '2', 'E'],
['-4,60233106656288100', '2', 'E'],
['-3,60233106656288100', '2', 'C'],
['-5,60233106656288100', '4', 'T'],
['-0,39019660724115224', '3', 'E'],
['-3,60233106656288100', '2', 'T'],
['-6,01086748514074000', '1', 'Q'],
['-5,02684650459461800', '0', 'X'],
['-1,25228509312138300', 'A', 'N'],
['-0,85517128843547330', '3', 'E'],
['1,837508975733196200', '3', '-', 'E'],
['1,850925075915637700', '5', '-', 'T'],
['1,826767133229081000', '4', '-', 'C'],
['1,845357865328532300', '3', '-', 'E'],
['0,636275318914609100', 'a', 'n', 'N']
]
I want to sort the shorter sublists first by the second column and then by the third column, so that the list stays ordered by the second column (the first row has 0 in the second column, then 1, then five twos, and so on) while the twos switch places so that I first have two E's, then two C's, and then T. After that, I want to sort the longer sublists by the fourth column. The row with A should be the last of the shorter sublists, and the row with a should be the last row overall. The output should be as follows:
[
['-5,02684650459461800', '0', 'X'],
['-6,01086748514074000', '1', 'Q'],
['-9,60233106656288100', '2', 'E'],
['-4,60233106656288100', '2', 'E'],
['-3,60233106656288100', '2', 'C'],
['-2,60233106656288100', '2', 'C'],
['-3,60233106656288100', '2', 'T'],
['-0,39019660724115224', '3', 'E'],
['-0,85517128843547330', '3', 'E'],
['-5,60233106656288100', '4', 'T'],
['-1,25228509312138300', 'A', 'N'],
['1,837508975733196200', '3', '-', 'E'],
['1,845357865328532300', '3', '-', 'E'],
['1,826767133229081000', '4', '-', 'C'],
['1,850925075915637700', '5', '-', 'T'],
['0,636275318914609100', 'a', 'n', 'N']
]
I know that I can sort according to the second column as:
ls.sort(key=lambda x:x[1])
But this sorts the whole list and gives:
['-5,02684650459461800', '0', 'X']
['-6,01086748514074000', '1', 'Q']
['-2,60233106656288100', '2', 'C']
['-9,60233106656288100', '2', 'E']
['-4,60233106656288100', '2', 'E']
['-3,60233106656288100', '2', 'C']
['-3,60233106656288100', '2', 'T']
['-0,39019660724115224', '3', 'E']
['-0,85517128843547330', '3', 'E']
['1,837508975733196200', '3', '-', 'E']
['1,845357865328532300', '3', '-', 'E']
['-5,60233106656288100', '4', 'T']
['1,826767133229081000', '4', '-', 'C']
['1,850925075915637700', '5', '-', 'T']
['-1,25228509312138300', 'A', 'N']
['0,636275318914609100', 'a', 'n', 'N']
How can I implement the sorting so that I can choose a certain portion of the list, sort it, and then sort it again according to another column?
Upvotes: 0
Views: 91
Reputation: 82899
If I understand you correctly, you want to sort the list first by the len of the sublists and then by their elements after the first. For this, you can use a tuple as the sort key, using the len and a slice of the sublist starting at the second element (i.e. at index 1):
ls.sort(key=lambda x: (len(x), x[1:]))
Note that this will also use elements after the fourth as further tie-breakers, which might not be wanted. Also this creates temporary (near) copies of all the sublists, which may be prohibitive if the lists are longer, even if all comparisons may be decided after the 3rd or 4th element.
Alternatively, if you only need the first four, or ten, or whatever number of elements, you can create a closed slice and use that to compare:
ls.sort(key=lambda x: (len(x), x[1:4]))
Since out-of-bounds slices are evaluated as empty lists, this works even if the lists have fewer elements than either the start- or end-index.
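To see the out-of-bounds behaviour in action, here is a minimal sketch. The rows are made-up, shortened values rather than the data from the question, and the name length_then_slice is just an illustrative helper equivalent to the lambda above:

```python
# Hypothetical sample rows (values shortened for illustration only).
rows = [
    ['-2,60', '2', 'E'],
    ['1,83', '3', '-', 'E'],
    ['-3,60', '2', 'C'],
    ['-5,02', '0', 'X'],
    ['0,00'],  # deliberately too short to have a second element
]

def length_then_slice(row):
    # row[1:4] never raises: out-of-range bounds simply shorten the slice.
    return (len(row), row[1:4])

# Slicing past the end of a one-element list gives an empty list.
short_slice = ['0,00'][1:4]
print(short_slice)

ordered = sorted(rows, key=length_then_slice)
for row in ordered:
    print(row)
```

Keep in mind that the elements are compared as strings, so within one length group '10' would sort before '2', digits sort before uppercase letters, and uppercase letters sort before lowercase ones.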
Upvotes: 2
Reputation: 31319
How about:
ls.sort(key=lambda x: (l := len(x), x[1], '' if l < 4 else x[3]))
That would sort it by the length of the sublist first, then by the 2nd column, and finally by the 4th column if there is one (picking '' in case there isn't, which would still sort it all the way to the top).
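The := operator needs Python 3.8 or newer. As a rough sketch of the same idea without it, the key can be written as a named function; length_then_columns and the shortened sample rows below are my own illustrative assumptions, not part of the question's data:

```python
# Hypothetical sample rows (values shortened for illustration only).
sample = [
    ['1,85', '5', '-', 'T'],
    ['-2,60', '2', 'C'],
    ['1,83', '3', '-', 'E'],
    ['-5,02', '0', 'X'],
    ['1,84', '3', '-', 'C'],
]

def length_then_columns(row):
    n = len(row)
    # Use '' as a placeholder when there is no fourth column.
    fourth = row[3] if n >= 4 else ''
    return (n, row[1], fourth)

ordered = sorted(sample, key=length_then_columns)
for row in ordered:
    print(row)
```

Because the length comes first in the tuple, rows with and without a fourth column never end up in the same group, so the '' placeholder only serves to keep the key well-defined for the shorter rows.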
Upvotes: 2