Vincent
Vincent

Reputation: 60381

Sort by grouping using first occurence in python?

In python, I have a list of classes that have a string field called category. Let's consider the following example:

mylist[0].category # = "furniture"
mylist[1].category # = "car"
mylist[2].category # = "fruit"
mylist[3].category # = "car"
mylist[4].category # = "furniture"

My question is: how to reorder the list by grouping using the first occurence of a new category ?

Using the previous example, the result would be:

mylist[0].category # = "furniture"
mylist[1].category # = "furniture"
mylist[2].category # = "car"
mylist[3].category # = "car"
mylist[4].category # = "fruit"

Upvotes: 1

Views: 104

Answers (4)

dstromberg
dstromberg

Reputation: 7177

Perhaps something like this?

#!/usr/local/cpython-3.3/bin/python

import pprint


CATEGORY_FIRST_SEEN = {}


def extract_order(list_of_class):
    for index, element in enumerate(list_of_class):
        if element.category not in CATEGORY_FIRST_SEEN:
            CATEGORY_FIRST_SEEN[element.category] = index

    #pprint.pprint(CATEGORY_FIRST_SEEN)


class Class_with_category:
    def __init__(self, category):
        self.category = category

    def __cmp__(self, other):
        if CATEGORY_FIRST_SEEN[self.category] < CATEGORY_FIRST_SEEN[other.category]:
            return -1
        elif CATEGORY_FIRST_SEEN[self.category] > CATEGORY_FIRST_SEEN[other.category]:
            return 1
        else:
            return 0

    def __lt__(self, other):
        return self.__cmp__(other) < 0

    def __str__(self):
        return self.category

    __repr__ = __str__


def main():
    mylist = [ "furniture", "car", "fruit", "car", "furniture", ]
    list_of_class = [ Class_with_category(element) for element in mylist ]
    extract_order(list_of_class)
    list_of_class.sort()
    pprint.pprint(list_of_class)


main()

I've tested it to work on cpython 3.3, but I believe it should work on 2.x or 3.x.

Upvotes: 0

Hugh Bothwell
Hugh Bothwell

Reputation: 56654

# create a first-appearance index
order = {}
for ndx,item in enumerate(mylist):
    if item.category not in order:
        order[item.category] = ndx

# sort by that index
mylist.sort(key=lambda i: order[i])

Upvotes: 1

behzad.nouri
behzad.nouri

Reputation: 77951

you may achieve this by traversing the list twice ( no sorting ):

from collections import defaultdict

# put all the items of the same category together
l = defaultdict( list )
for x in mylist:
    l[ x.category ].append( x )

# expand in the order categories appear in the list
xs = [ ]
for x in mylist:
    xs.extend( l[ x.category ] )
    l.pop( x.category )

Upvotes: 0

Stuart
Stuart

Reputation: 9858

First, get a list of the categories in the same order as my_list. Then, sort my_list according to the position of the first appearance of each item's category in the list of categories.

categories = [item.category for item in my_list]
my_list.sort(key = lambda item: categories.index(item.category))

Upvotes: 2

Related Questions