Miguel

Reputation: 43

Sorting a list based on containing lists

I have to sort a list containing other lists based on the value at index 0. As the example below shows, I have to sort 'tempdb' by the year of each nested list, but I have no clue how. I am not allowed to import a library like NumPy, so it has to be 'naked' Python code.

Example of a list:

tempdb = [
['8-8-2007', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2007'],
['8-8-2015', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2015'],
['30-11-2005', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'nov', '2005'],
['8-8-2006', '268', 'Anonymized', 'ordertarief', 'aug', '2006'],
['30-11-2006', '268', 'Anonymized', 'ordertarief', 'nov', '2006'],
['30-11-2003', '268', 'Anonymized', 'gammaglutamyltranspeptidase', 'nov', '2003'],
['30-11-2006', '268', 'Anonymized', 'melkzuurdehydrogenase -ldh- kinetisch', 'nov', '2006'],
['30-11-2006', '268', 'Anonymized', 'alkalische fosfatase -kinetisch-', 'nov', '2006'],
['30-11-2002', '268', 'Anonymized', 'natrium vlamfotometrisch', 'nov', '2002'],
]

This is something I found and already tried, but it did not work for me.

sort_on = lambda pos: lambda x: x[pos]
tempdb = sorted(tempdb, key=sort_on(1))

My goal is to start with the oldest year (e.g. 2002) and end with the newest year (e.g. 2015).

Upvotes: 1

Views: 66

Answers (3)

Mykola Zotko

Reputation: 17824

You can use the datetime module from the standard library to compare dates:

from datetime import datetime

tempdb = [
['8-8-2007', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2007'],
['8-8-2015', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2015'],
['30-11-2005', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'nov', '2005'],
['8-8-2006', '268', 'Anonymized', 'ordertarief', 'aug', '2006'],
['30-11-2006', '268', 'Anonymized', 'ordertarief', 'nov', '2006'],
['30-11-2003', '268', 'Anonymized', 'gammaglutamyltranspeptidase', 'nov', '2003'],
['30-11-2006', '268', 'Anonymized', 'melkzuurdehydrogenase -ldh- kinetisch', 'nov', '2006'],
['30-11-2006', '268', 'Anonymized', 'alkalische fosfatase -kinetisch-', 'nov', '2006'],
['30-11-2002', '268', 'Anonymized', 'natrium vlamfotometrisch', 'nov', '2002'],
]

tempdb = sorted(tempdb, key=lambda x: datetime.strptime(x[0], '%d-%m-%Y'))

for row in tempdb:
    print(row)

Output:

['30-11-2002', '268', 'Anonymized', 'natrium vlamfotometrisch', 'nov', '2002']
['30-11-2003', '268', 'Anonymized', 'gammaglutamyltranspeptidase', 'nov', '2003']
['30-11-2005', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'nov', '2005']
['8-8-2006', '268', 'Anonymized', 'ordertarief', 'aug', '2006']
['30-11-2006', '268', 'Anonymized', 'ordertarief', 'nov', '2006']
['30-11-2006', '268', 'Anonymized', 'melkzuurdehydrogenase -ldh- kinetisch', 'nov', '2006']
['30-11-2006', '268', 'Anonymized', 'alkalische fosfatase -kinetisch-', 'nov', '2006']
['8-8-2007', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2007']
['8-8-2015', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2015']

If you want to sort in the opposite direction, add reverse=True to the sorted() call.
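As a minimal sketch of the reversed order, assuming the same day-month-year format in column 0 (the shortened `rows` list and the `parse_date` helper are made up for illustration):

```python
from datetime import datetime

# Shortened sample rows; only the layout matches the question's data.
rows = [
    ['8-8-2007', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2007'],
    ['30-11-2002', '268', 'Anonymized', 'natrium vlamfotometrisch', 'nov', '2002'],
    ['8-8-2015', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2015'],
]

def parse_date(row):
    # Column 0 holds the date as day-month-year.
    return datetime.strptime(row[0], '%d-%m-%Y')

# reverse=True flips the order: newest date first.
newest_first = sorted(rows, key=parse_date, reverse=True)

for row in newest_first:
    print(row[0])
```

This should print the 2015 row first and the 2002 row last.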

Upvotes: 2

sam46

Reputation: 1271

Assuming you want to sort by year, then by month (the number in the middle), then by day (the first number in the date):

sorted(tempdb, key=lambda x: tuple(map(int, x[0].split('-')[::-1])))
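To see why the conversion to integers matters, here is a small sketch that compares a plain string sort with the tuple key; the one-column `rows` list and the `date_key` name are only for illustration:

```python
def date_key(row):
    # '8-8-2007' -> ['8', '8', '2007'] -> ['2007', '8', '8'] -> (2007, 8, 8)
    return tuple(map(int, row[0].split('-')[::-1]))

# Illustrative rows that contain only the date column.
rows = [['8-8-2007'], ['30-11-2006'], ['8-8-2006']]

# Sorting the raw strings compares character by character, so '30-...' comes before '8-...'.
by_string = sorted(rows, key=lambda row: row[0])

# Sorting by (year, month, day) tuples compares numbers, year first.
by_date = sorted(rows, key=date_key)

print(by_string)
print(by_date)
```

The string sort puts '30-11-2006' ahead of '8-8-2006', while the tuple key orders August before November within 2006.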

Upvotes: 0

Patrick Artner

Reputation: 51653

The easiest solution would be to sort by the last string of each inner list, but that will not order months and days correctly when the year is identical.
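A rough sketch of that year-only approach, using shortened three-column rows invented for illustration, shows the limitation:

```python
# Shortened illustrative rows: date, month name, year.
rows = [
    ['30-11-2006', 'nov', '2006'],
    ['8-8-2007', 'aug', '2007'],
    ['8-8-2006', 'aug', '2006'],
]

# Sort on the last element only, converted to an integer year.
by_year = sorted(rows, key=lambda row: int(row[-1]))

for row in by_year:
    print(row)
```

Because sorted() is stable, the two 2006 rows keep their input order, so November still appears before August.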

Instead, you can take the first element, split it at '-', convert each part to an integer, reverse the result, and use that as the sort key:

'8-8-2015' --> [2015,8,8] 

Code:

tempdb = [
['8-8-2007', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2007'],
['8-8-20015', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2015'],
['30-11-2005', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'nov', '2005'],
['8-8-2006', '268', 'Anonymized', 'ordertarief', 'aug', '2006'],
['30-11-2006', '268', 'Anonymized', 'ordertarief', 'nov', '2006'],
['30-11-2003', '268', 'Anonymized', 'gammaglutamyltranspeptidase', 'nov', '2003'],
['30-11-2006', '268', 'Anonymized', 'melkzuurdehydrogenase -ldh- kinetisch', 'nov', '2006'],
['30-11-2006', '268', 'Anonymized', 'alkalische fosfatase -kinetisch-', 'nov', '2006'],
['30-11-2002', '268', 'Anonymized', 'natrium vlamfotometrisch', 'nov', '2002'],
]

s = sorted(tempdb, key = lambda x: list(map(int,reversed(x[0].split('-')))))
print(s)

Output:

[['30-11-2002', '268', 'Anonymized', 'natrium vlamfotometrisch', 'nov', '2002'], 
 ['30-11-2003', '268', 'Anonymized', 'gammaglutamyltranspeptidase', 'nov', '2003'], 
 ['30-11-2005', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'nov', '2005'], 
 ['8-8-2006', '268', 'Anonymized', 'ordertarief', 'aug', '2006'], 
 ['30-11-2006', '268', 'Anonymized', 'ordertarief', 'nov', '2006'], 
 ['30-11-2006', '268', 'Anonymized', 'melkzuurdehydrogenase -ldh- kinetisch', 'nov','2006'],
 ['30-11-2006', '268', 'Anonymized', 'alkalische fosfatase -kinetisch-', 'nov', '2006'],
 ['8-8-2007', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2007'], 
 ['8-8-20015', '268', 'Anonymized', 'aanname laboratoriumonderzoek', 'aug', '2015']]

Your data still contains "invalid" entries, e.g. '8-8-20015'. That's on you.
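If you want to spot such entries before sorting, one possible check (a sketch only; `suspect_rows` is a hypothetical helper, and it assumes the last column is meant to repeat the year from the date) is to compare the two:

```python
def suspect_rows(rows):
    """Return rows whose date year differs from the year in the last column."""
    bad = []
    for row in rows:
        # Column 0 is day-month-year; take the part after the last '-'.
        year_in_date = row[0].split('-')[-1]
        if year_in_date != row[-1]:
            bad.append(row)
    return bad

# Shortened illustrative rows: date, month name, year.
rows = [
    ['8-8-2007', 'aug', '2007'],
    ['8-8-20015', 'aug', '2015'],
    ['30-11-2005', 'nov', '2005'],
]

print(suspect_rows(rows))
```

Here only the '8-8-20015' row is reported, because its date year does not match '2015'.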

Upvotes: 1
