Mike H-R

Reputation: 7815

How do I sort a dictionary of results into a list of strings, ordered by the date in part of each string (in Python)?

Sorry for the obtuse-sounding question, but I'm trying to help a friend by writing a script that substitutes some references into the correct format. What I want to write is a function (some_func) that takes a bunch of numbers (the keys to the dictionary, which I will print in a second) and returns a list of the strings ordered by date:

some_func(num1, num2, num3, (...))

It will read from my dictionary (which I populated by using regexes to lift some HTML into a nice dict format):

{'1': ' Bauer et al. (2000). ', '2': 'G. M. Kirwan in litt. (1999). ', '5': ' Scott (1997). ', '4': ' Pacheco (1999). ', '7': ' Venturini et al. (2005). ', '6': ' Venturini et al. (2002). ', '8': 'P. Develey in litt. (2007, 2008). '}

If given, for example, (1, 2, 7), it will return ['G. M. Kirwan in litt. (1999). ', ' Bauer et al. (2000). ', ' Venturini et al. (2005). ']

I was planning to use regexes to search for a date string and then order the references by that, but I feel there's a better way. I also need the function to take an unknown number of inputs, and I am slightly unsure how to accomplish this. If anyone wants to really blitz this question, they could tell me how to order by month when the year is the same (imagine the references were of the form 'G. M. Kirwan in litt. Jan (1999). ' etc.).

Thanks for reading. Sorry about the sloppiness of the question, but the data's somewhat unstructured and I've had to mess around a bit just to get it into this format.

Upvotes: 0

Views: 173

Answers (1)

Nolen Royalty

Reputation: 18633

Something like this?

>>> import re
>>> def get_year(citation):
...     # Capture the first four-digit year inside the trailing "(...)."
...     citation = citation.strip()
...     year = re.search(r"\((\d{4}).*\)\.$", citation).group(1)
...     return int(year)
...
>>> test_list = ['Bauer et al. (2000).', 'G. M. Kirwan in litt. (1999).', 'Pacheco (1999).', 'Scott (1997).', 'Venturini et al. (2002).', 'Venturini et al. (2005).', 'P. Develey in litt. (2007, 2008).']
>>> test_list
['Bauer et al. (2000).', 'G. M. Kirwan in litt. (1999).', 'Pacheco (1999).', 'Scott (1997).', 'Venturini et al. (2002).', 'Venturini et al. (2005).', 'P. Develey in litt. (2007, 2008).']
>>> test_list.sort(key=get_year)  # list.sort is stable, so equal years keep their original order
>>> test_list
['Scott (1997).', 'G. M. Kirwan in litt. (1999).', 'Pacheco (1999).', 'Bauer et al. (2000).', 'Venturini et al. (2002).', 'Venturini et al. (2005).', 'P. Develey in litt. (2007, 2008).']
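
To tie this back to the function described in the question, here is a minimal sketch of how the lookup and the variable number of arguments might fit together. It assumes the dictionary keys are strings (as in the question's sample) while the caller passes integers, and it uses a slightly looser regex than the one above so the trailing whitespace in the stored strings does not matter. The name REFERENCES is made up for illustration; some_func is the name from the question.

```python
import re

# Sample data copied from the question; note that the keys are strings.
REFERENCES = {
    '1': ' Bauer et al. (2000). ',
    '2': 'G. M. Kirwan in litt. (1999). ',
    '5': ' Scott (1997). ',
    '4': ' Pacheco (1999). ',
    '7': ' Venturini et al. (2005). ',
    '6': ' Venturini et al. (2002). ',
    '8': 'P. Develey in litt. (2007, 2008). ',
}


def get_year(citation):
    """Return the first four-digit year that follows an opening parenthesis."""
    match = re.search(r"\((\d{4})", citation)
    if match is None:
        raise ValueError("no year found in %r" % citation)
    return int(match.group(1))


def some_func(*keys):
    """Look up each key in REFERENCES and return the citations sorted by year.

    *keys collects any number of positional arguments into a tuple.
    """
    citations = [REFERENCES[str(key)] for key in keys]
    return sorted(citations, key=get_year)


print(some_func(1, 2, 7))
# ['G. M. Kirwan in litt. (1999). ', ' Bauer et al. (2000). ', ' Venturini et al. (2005). ']
```

The strings are returned exactly as stored, including their leading and trailing spaces; call strip() on each one if that is not wanted.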

(Regex masters, I still have a lot to learn when it comes to regexes so please let me know if my regex is weak).
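
For the month tie-breaker mentioned in the question, one possible approach is to sort on a (year, month) tuple instead of the year alone. This is only a sketch: it assumes the month appears as a three-letter English abbreviation directly before the parenthesised year, as in the hypothetical 'G. M. Kirwan in litt. Jan (1999). ' form, and it treats a missing or unrecognised month as 0 so that those entries sort first within their year. MONTHS, PATTERN and date_key are illustrative names.

```python
import re

# Map assumed three-letter month abbreviations to 1..12.
MONTHS = {name: number for number, name in enumerate(
    ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
     'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], start=1)}

# Optional capitalised three-letter word, then "(" and a four-digit year.
PATTERN = re.compile(r"(?:\b([A-Z][a-z]{2})\s+)?\((\d{4})")


def date_key(citation):
    """Return (year, month) for sorting; month is 0 when absent or unknown."""
    match = PATTERN.search(citation)
    if match is None:
        raise ValueError("no year found in %r" % citation)
    month, year = match.groups()
    return int(year), MONTHS.get(month, 0)


references = [
    'G. M. Kirwan in litt. Mar (1999). ',
    ' Pacheco Jan (1999). ',
    ' Scott (1997). ',
]
print(sorted(references, key=date_key))
```

Because Python compares tuples element by element, the month is only consulted when two years are equal.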

Upvotes: 3
