user1443368

Reputation: 131

Sorting dictionary list-values based on time

I'm pretty new to Python (a couple of weeks into it) and I'm having some trouble wrapping my head around data structures. So far I've extracted text line by line from a .txt file and stored the lines in a dictionary keyed by animal, for example:

database = {
    'dog': ['apple', 'dog', '2012-06-12-08-12-59'],
    'cat': [
        ['orange', 'cat', '2012-06-11-18-33-12'],
        ['blue', 'cat', '2012-06-13-03-23-48']
    ],
    'frog': ['kiwi', 'frog', '2012-06-12-17-12-44'],
    'cow': [
        ['pear', 'ant', '2012-06-12-14-02-30'],
        ['plum', 'cow', '2012-06-12-23-27-14']
    ]
} 

# year-month-day-hour-min-sec                                       

The goal is that when I print my dictionary, it prints out grouped by animal type, with the newest dates first.

What's the best way to go about sorting this data by time? I'm on Python 2.7. What I'm thinking is:

for each key:

grab the list (or list of lists) --> get the 3rd entry --> split it on '-' --> then maybe try sorted(parameters)

I'm just not really sure how to go about this...

Upvotes: 2

Views: 2823

Answers (3)

chees

Reputation: 595

Firstly, you'll probably want every key/value item in the dict to have the same shape. At the moment some values (e.g. database['dog']) are a list of strings (a single line) and some (e.g. database['cat']) are a list of lines. If you get them all into list-of-lines format (even if there's only one item in the list of lines), it will be much easier.
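One way to do that normalization is sketched below. The helper name normalize is made up for illustration, and the sketch assumes a lone line can be recognized because its first element is a string rather than a list.

```python
def normalize(database):
    """Return a new dict in which every value is a list of lines."""
    normalized = {}
    for animal, value in database.items():
        # A lone line is a flat list of strings: wrap it in an outer list.
        if value and not isinstance(value[0], list):
            value = [value]
        normalized[animal] = value
    return normalized


database = {
    'dog': ['apple', 'dog', '2012-06-12-08-12-59'],
    'cat': [
        ['orange', 'cat', '2012-06-11-18-33-12'],
        ['blue', 'cat', '2012-06-13-03-23-48'],
    ],
}

normalized = normalize(database)
print(normalized['dog'])  # [['apple', 'dog', '2012-06-12-08-12-59']]
```

After this step every value can be handed to sorted without checking its shape first.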

Then, one (old) way would be to write a comparison function for those lines. This is easy since your dates are already in a format that's directly comparable as strings. To compare two lines, you compare the 3rd item (index 2) of each:

def compare_line_by_date(x,y):
    return cmp(x[2],y[2])

Finally you can get the lines for a particular key sorted by telling the sorted builtin to use your compare_line_by_date function:

sorted(database['cat'],compare_line_by_date)

The approach above works for arbitrarily complex comparison functions, but it is slow, and the cmp argument was removed in Python 3. There are other ways to do your particular sort, for example by using the key parameter of sorted:

def key_for_line(line):
    return line[2]

sorted(database['cat'],key=key_for_line)

Sorting with a key is much faster than sorting with cmp because the key function only needs to run once per item in the list, instead of every time two items are compared (which usually happens far more often than there are items). The idea of a key is to boil each list item down into something that can be compared naturally, like a string or a number. In the example above we boiled each line down to just its date, which is then compared.
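The difference in call counts can be observed directly. The sketch below is an illustration only: the sample lines and the calls counter are made up for the demonstration, and functools.cmp_to_key (available since Python 2.7) stands in for the cmp argument so the code also runs on Python 3.

```python
from functools import cmp_to_key

# Hypothetical sample lines in the question's [word, animal, timestamp] layout.
lines = [
    ['plum', 'cow', '2012-06-12-23-27-14'],
    ['orange', 'cat', '2012-06-11-18-33-12'],
    ['blue', 'cat', '2012-06-13-03-23-48'],
    ['pear', 'ant', '2012-06-12-14-02-30'],
    ['kiwi', 'frog', '2012-06-12-17-12-44'],
    ['apple', 'dog', '2012-06-12-08-12-59'],
]

calls = {'cmp': 0, 'key': 0}  # how often each function gets called


def compare_line_by_date(x, y):
    calls['cmp'] += 1
    # Same result as cmp(x[2], y[2]), written so it is valid on Python 3 too.
    return (x[2] > y[2]) - (x[2] < y[2])


def key_for_line(line):
    calls['key'] += 1
    return line[2]


by_cmp = sorted(lines, key=cmp_to_key(compare_line_by_date))
by_key = sorted(lines, key=key_for_line)

print(by_cmp == by_key)  # True: both approaches produce the same order
print(calls)             # 'key' equals len(lines); 'cmp' depends on the input order
```

The exact number of comparisons depends on the input and on the sort implementation, so the gap only becomes obvious on longer lists.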

Disclaimer: I haven't tested any of the code in this answer... but it should work!

Upvotes: 1

cheeken

Reputation: 34655

Walk through the elements of your dictionary. For each value, run sorted on your list of lists, and tell the sorting algorithm to use the third field of the list as the "key" element. This key element is what is used to compare values to other elements in the list in order to ascertain sort order. To tell sorted which element of your lists to sort with, use operator.itemgetter to specify the third element.

Since your timestamps are rigidly structured and each character in the timestamp is more temporally significant than the next one, you can sort them naturally, like strings - you don't need to convert them to times.

# Dictionary stored in d
from operator import itemgetter
# Iterate over the elements of the dictionary; below, by
# calling items(), k gets the key value of an entry and 
# v gets the value of that entry
for k,v in d.items():
    if v and isinstance(v[0], list):
        v.sort(key=itemgetter(2)) # Start with 0, so third element is 2
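The question asks for the newest dates first, which the loop above does not do by itself because it sorts in ascending order. A minimal variant, assuming d holds data shaped like the question's dictionary, passes reverse=True to sort:

```python
from operator import itemgetter

# Sample data copied from the question (a subset of the animals).
d = {
    'dog': ['apple', 'dog', '2012-06-12-08-12-59'],
    'cat': [
        ['orange', 'cat', '2012-06-11-18-33-12'],
        ['blue', 'cat', '2012-06-13-03-23-48'],
    ],
    'cow': [
        ['pear', 'ant', '2012-06-12-14-02-30'],
        ['plum', 'cow', '2012-06-12-23-27-14'],
    ],
}

for k, v in d.items():
    # Entries that hold a single line (a flat list of strings) are skipped.
    if v and isinstance(v[0], list):
        v.sort(key=itemgetter(2), reverse=True)  # newest timestamp first

print(d['cat'][0][2])  # 2012-06-13-03-23-48
print(d['cow'][0][2])  # 2012-06-12-23-27-14
```

Sorting each list in place is safe here because the loop only changes the values, never the set of keys.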

Upvotes: 4

iMom0

Reputation: 12921

If your dates are all in the format year-month-day-hour-min-sec, e.g. 2012-06-12-23-27-14, I think the split step is unnecessary; just compare them as strings.

>>> '2012-06-12-23-27-14' > '2012-06-12-14-02-30'                              
True 
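As a sanity check, the string order can be compared with the order obtained by actually parsing the timestamps. This is a sketch: the FORMAT string is my reading of the year-month-day-hour-min-sec layout, and the sample stamps are taken from the question.

```python
from datetime import datetime

FORMAT = '%Y-%m-%d-%H-%M-%S'  # year-month-day-hour-min-sec

stamps = [
    '2012-06-12-23-27-14',
    '2012-06-12-14-02-30',
    '2012-06-13-03-23-48',
    '2012-06-11-18-33-12',
]

as_strings = sorted(stamps)
as_datetimes = sorted(stamps, key=lambda s: datetime.strptime(s, FORMAT))
# The split-based plan from the question: compare tuples of integers.
as_int_tuples = sorted(stamps, key=lambda s: tuple(int(p) for p in s.split('-')))

print(as_strings == as_datetimes == as_int_tuples)  # True
```

The shortcut relies on every field being zero-padded to a fixed width; with a stamp like 2012-6-12-8-12-59 the plain string comparison would no longer be reliable and one of the parsed keys would be needed.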

Upvotes: 3
