Reputation: 7815
Sorry for the obtuse-sounding question, but I'm trying to help a friend by writing a script that substitutes some references into the correct format. I want to write a function (somefunc) that takes a bunch of numbers (the keys to the dictionary shown below) and returns a list of the corresponding strings ordered by date:
somefunc(num1, num2, num3,(...))
It will use my dictionary (which I populated by using regexes to lift some HTML into a nice dict format):
{'1': ' Bauer et al. (2000). ', '2': 'G. M. Kirwan in litt. (1999). ', '5': ' Scott (1997). ', '4': ' Pacheco (1999). ', '7': ' Venturini et al. (2005). ', '6': ' Venturini et al. (2002). ', '8': 'P. Develey in litt. (2007, 2008). '}
and, if given for example (1, 2, 7), will return ['G. M. Kirwan in litt. (1999). ', ' Bauer et al. (2000). ', ' Venturini et al. (2005). ']
I was planning to use regexes to search for a date string and order the references that way, but I feel there's a better way. I also need the function to take an unknown number of inputs, and I'm not sure how to accomplish this. If anyone wants to really blitz this question, they could also tell me how to order by month when the year is the same (imagine the references were of the form 'G. M. Kirwan in litt. Jan (1999). ', etc.).
Thanks for reading. Sorry about the sloppiness of the question, but the data is somewhat unstructured and I've had to mess around a bit just to get it into this format.
Upvotes: 0
Views: 173
Reputation: 18633
Something like this?
>>> import re
>>> def get_year(citation):
...     citation = citation.strip()
...     year = re.search(r"\((\d{4}).*\)\.$", citation).group(1)
...     return int(year)
...
>>> test_list = ['Bauer et al. (2000).', 'G. M. Kirwan in litt. (1999).', 'Pacheco (1999).', 'Scott (1997).', 'Venturini et al. (2002).', 'Venturini et al. (2005).', 'P. Develey in litt. (2007, 2008).']
>>> test_list
['Bauer et al. (2000).', 'G. M. Kirwan in litt. (1999).', 'Pacheco (1999).', 'Scott (1997).', 'Venturini et al. (2002).', 'Venturini et al. (2005).', 'P. Develey in litt. (2007, 2008).']
>>> test_list.sort(key = get_year)
>>> test_list
['Scott (1997).', 'G. M. Kirwan in litt. (1999).', 'Pacheco (1999).', 'Bauer et al. (2000).', 'Venturini et al. (2002).', 'Venturini et al. (2005).', 'P. Develey in litt. (2007, 2008).']
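To tie this back to the function described in the question, here is a rough sketch of how the lookup and the variable number of arguments might fit together. It assumes Python 3 (for the keyword-only `references` parameter), uses the dictionary exactly as posted in the question, and swaps in a looser regex that only looks for the first four-digit year after an opening parenthesis. The `REFERENCES` name and the `references` parameter are my own placeholders, not anything from the original script.

```python
import re

# Dictionary copied from the question; note that the keys are strings.
REFERENCES = {
    '1': ' Bauer et al. (2000). ',
    '2': 'G. M. Kirwan in litt. (1999). ',
    '5': ' Scott (1997). ',
    '4': ' Pacheco (1999). ',
    '7': ' Venturini et al. (2005). ',
    '6': ' Venturini et al. (2002). ',
    '8': 'P. Develey in litt. (2007, 2008). ',
}


def get_year(citation):
    """Return the first four-digit year that follows an opening parenthesis."""
    match = re.search(r"\((\d{4})", citation)
    if match is None:
        raise ValueError("no year found in %r" % citation)
    return int(match.group(1))


def somefunc(*keys, references=REFERENCES):
    """Look up each key (int or str) and return the citations sorted by year."""
    # *keys collects any number of positional arguments into a tuple.
    citations = [references[str(key)] for key in keys]
    return sorted(citations, key=get_year)


print(somefunc(1, 2, 7))
```

The `*keys` syntax is what lets the function accept an unknown number of inputs, and `sorted` with a `key` function avoids having to reorder anything by hand. A key that is not in the dictionary raises `KeyError`, which may or may not be what you want.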
(Regex masters: I still have a lot to learn when it comes to regexes, so please let me know if my regex is weak.)
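For the month question, one possible approach is to have the sort key return a `(year, month)` tuple, since Python compares tuples element by element. The sketch below assumes the hypothetical format from the question, with a three-letter English month abbreviation directly before the parenthesised year (as in 'G. M. Kirwan in litt. Jan (1999). '). The names `MONTH_NAMES`, `DATE_PATTERN` and `get_year_month` are made up for illustration. Be aware that an author actually called "May" right before the year would be misread as a month.

```python
import re

MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
# Map each abbreviation to its month number, 1 through 12.
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}

# Optional month abbreviation (with optional trailing dot), then "(YYYY".
DATE_PATTERN = re.compile(
    r"(?:\b(%s)\.?\s+)?\((\d{4})" % "|".join(MONTH_NAMES)
)


def get_year_month(citation):
    """Return (year, month) for sorting; month is 0 when none is given."""
    match = DATE_PATTERN.search(citation)
    if match is None:
        raise ValueError("no year found in %r" % citation)
    month_name, year = match.groups()
    # Citations without a month sort before dated ones from the same year.
    return int(year), MONTHS.get(month_name, 0)


citations = [
    'G. M. Kirwan in litt. Mar (1999). ',
    ' Pacheco Jan (1999). ',
    ' Scott (1997). ',
    ' Bauer et al. (1999). ',
]
print(sorted(citations, key=get_year_month))
```

Treating a missing month as 0 is an arbitrary choice; use 13 instead if undated references should come last within their year.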
Upvotes: 3