Reputation: 65
I have a list of lists. The first element in each sublist is a chromosome, e.g. 'chr1', 'chr5', 'chr10', 'chrX' or 'chrY'. I want to sort the sublists by chromosome number, with X and Y after the numbered chromosomes. I have tried the following:
List.sort(key=lambda x: Set_Chr_Nr_(x[0]))
I am using the following function, which takes the chromosome string, removes the 'chr' prefix, converts the remainder to an int if it is a number, and assigns a number if it is 'X', 'Y' or 'M'.
def Set_Chr_Nr_(Chr):
    """ Sort by chromosome """
    if Chr:
        New = Chr[3:]
        if New == 'X': New = 23
        elif New == 'Y': New = 24
        elif New == 'M': New = 25
        else: New = int(New)
    else:
        New = 0
    return New
But it does not return the desired sort order. Instead, I get a list that starts with the sublists containing 'chr1' but puts the sublists containing 'chr10' next, not 'chr2'. What am I doing wrong here?
Example data with column header:
Type OriginChr OriginBegin OriginEnd DestChr DestBegin DestEnd
inversion chr10 13105010 13105143 chr10 13104876 13105378
inversion chr14 87902496 87902539 chr14 87902497 87902540
Rick
Upvotes: 1
Views: 1118
Reputation: 4467
You can try sorted(). Note that it returns a new sorted list and leaves the original unchanged, so assign the result:
a = ['chr1', 'chr10', 'chr5', 'chrX']
a = sorted(a, key=Set_Chr_Nr_)
print(a)
If you want to use list.sort(), which sorts the list in place, pass the same key function:
a.sort(key=Set_Chr_Nr_)
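To see why 'chr10' can land before 'chr2', and how sorted() and list.sort() differ, here is a small sketch. chrom_key is a hypothetical stand-in for Set_Chr_Nr_ (same mapping, written with a dict), and the list of names is made up for illustration.

```python
def chrom_key(chrom):
    """Hypothetical stand-in for Set_Chr_Nr_: 'chr10' -> 10, 'chrX' -> 23."""
    special = {'X': 23, 'Y': 24, 'M': 25}
    name = chrom[3:]  # drop the leading 'chr'
    return special[name] if name in special else int(name)

names = ['chr10', 'chrX', 'chr2', 'chr1']  # illustrative input only

# Without a key, strings are compared character by character,
# so 'chr10' sorts before 'chr2' because '1' < '2'.
by_string = sorted(names)

# sorted() returns a new list and leaves `names` untouched.
by_number = sorted(names, key=chrom_key)

# list.sort() reorders the list in place and returns None.
in_place = list(names)
in_place.sort(key=chrom_key)

print(by_string)
print(by_number)
print(in_place)
```

If your output looks like the first list, the key function is probably not being applied to the list you are printing.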
For your original input, if the column layout is fixed (chromosome in the second column), this will work:
a = [['inversion', 'chr14', 87902496, 87902539, 'chr14', 87902497, 87902540], ['inversion', 'chr10', 13105010, 13105143, 'chr10', 13104876, 13105378]]
a = sorted(a, key=lambda x: Set_Chr_Nr_(x[1]))
print(a)
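If you also want rows on the same chromosome ordered by position, a tuple key is one option. This is only a sketch: chrom_key is a hypothetical equivalent of Set_Chr_Nr_, the column positions (chromosome at index 1, start at index 2) are assumed from the example header, and the last three rows are made up to show the tie-breaking.

```python
def chrom_key(chrom):
    """Hypothetical equivalent of Set_Chr_Nr_: 'chr10' -> 10, 'chrX' -> 23."""
    special = {'X': 23, 'Y': 24, 'M': 25}
    name = chrom[3:]  # drop the leading 'chr'
    return special[name] if name in special else int(name)

rows = [
    ['inversion', 'chr14', 87902496, 87902539, 'chr14', 87902497, 87902540],
    ['inversion', 'chr10', 13105010, 13105143, 'chr10', 13104876, 13105378],
    ['inversion', 'chrX', 500, 600, 'chrX', 510, 610],  # made-up row
    ['inversion', 'chr2', 900, 950, 'chr2', 905, 955],  # made-up row
    ['inversion', 'chr2', 100, 150, 'chr2', 105, 155],  # made-up row
]

# Tuples compare element by element: chromosome number first,
# then OriginBegin for rows on the same chromosome.
rows.sort(key=lambda row: (chrom_key(row[1]), row[2]))

order = [(row[1], row[2]) for row in rows]
for row in rows:
    print(row)
```

If the header line is part of the list, remove it before sorting, since int() would fail on 'OriginChr'[3:].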
Upvotes: 1