Apply similar sort of elements across multiple lists (clustering)

Question

I have two dictionaries:

spots, which contains the number of spots per position per day (e.g. on day 0, position 'a' has 3 spots), and
names, which contains the available names per position per day (e.g. on day 0, position 'a' is occupied by John, Claire and Billy).

Sample

import pandas as pd

spots = {
    0: {'a': 3, 'b': 3},
    1: {'a': 3, 'b': 3},
    2: {'a': 1},
    3: {'a': 3, 'b': 3},
    4: {'a': 4, 'b': 3},
    }

names = {
    0: {'a': ['John', 'Claire', 'Billy'], 'b': ['Paul']},
    1: {'a': ['John', 'Billy', 'Claire']},
    2: {'a': ['Billy']},
    3: {'a': ['Claire', 'Billy'], 'b': ['Paul', 'Peter']},
    4: {'a': ['Anna', 'Claire', 'Billy'], 'b': ['Peter']},
    }

I would like to sort the names in each list so that every name keeps the same position across days whenever possible (e.g. Billy is available on all days and must therefore be put in the first position, whereas Anna is available on the fewest days and must be put at the end of the names).

Expected outcome

output = {
    0: {'a': ['Billy', 'Claire', 'John'], 'b': ['Paul', '', '']},
    1: {'a': ['Billy', 'Claire', 'John'], 'b': ['', '', '']},
    2: {'a': ['Billy']},
    3: {'a': ['Billy', 'Claire', ''], 'b': ['Paul', 'Peter', '']},
    4: {'a': ['Billy', 'Claire', 'Anna', ''], 'b': ['Peter', '', '']},
    }

Own solution

def name_per_category(names):
    unique_names = {'a': set(), 'b': set()}
    for key in names:
        for category in names[key]:
            unique_names[category].update(names[key][category])
    return {category: sorted(unique_names[category]) for category in unique_names}


def sort_names(spots, names):
    output = {}
    sorted_names = name_per_category(names)

    for key in spots:
        output[key] = {}
        for category in spots[key]:
            sorted_list = [''] * spots[key][category]

            if key in names and category in names[key]:
                for name in names[key][category]:
                    # Find the index for each name
                    index = sorted_names[category].index(name)
                    print(index, name)
                    sorted_list[index-1] = name
            output[key][category] = sorted_list
    
    return output

output = sort_names(spots, names)

Wrong output

{0: {'a': ['Billy', 'Claire', 'John'], 'b': ['', '', 'Paul']},
 1: {'a': ['Billy', 'Claire', 'John'], 'b': ['', '', '']},
 2: {'a': ['Billy']},
 3: {'a': ['Billy', 'Claire', ''], 'b': ['Peter', '', 'Paul']},
 4: {'a': ['Billy', 'Claire', '', 'Anna'], 'b': ['Peter', '', '']}}
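
The misplaced entries appear to come from two things in the attempt above: the names are ranked alphabetically rather than by how often they are available, and sorted_list[index-1] turns into sorted_list[-1] for the first name in that ranking, which Python treats as the last slot. A minimal sketch reproducing the second effect for position 'b' on day 0 (the variable names slots and ranking are only illustrative):

```python
# Three free slots for position 'b' on day 0.
slots = [''] * 3

# Alphabetical ranking of all names seen for position 'b'.
ranking = sorted({'Paul', 'Peter'})

# 'Paul' is first in the ranking, so index is 0 and index - 1 is -1.
index = ranking.index('Paul')
slots[index - 1] = 'Paul'

# A negative index counts from the end, so 'Paul' lands in the last slot:
# slots is now ['', '', 'Paul']
```

The same wrap-around puts Anna, who is alphabetically first for position 'a', into the last slot on day 4.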

Apart from the wrong output, my logic seems quite heavy. Is there a better way to reason about this type of problem?
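
One possible way to reason about it, as a rough sketch rather than a definitive answer: count on how many days each name appears per position, sort each day's names by that count (most frequent first), and pad the result with empty strings up to the number of spots. The function name sort_names_by_availability is hypothetical, and two assumptions go beyond the question: ties are broken alphabetically, and a day never lists more names than it has spots (as in the sample).

```python
from collections import Counter, defaultdict


def sort_names_by_availability(spots, names):
    # For each position, count on how many days each name is available.
    counts = defaultdict(Counter)
    for day_names in names.values():
        for position, people in day_names.items():
            counts[position].update(set(people))

    output = {}
    for day, day_spots in spots.items():
        output[day] = {}
        for position, n_spots in day_spots.items():
            # A position may be missing from names on a given day.
            people = names.get(day, {}).get(position, [])
            # Most frequently available names first; ties broken
            # alphabetically (an assumption, not stated in the question).
            ranked = sorted(
                people,
                key=lambda name: (-counts[position][name], name),
            )
            # Pad with empty strings up to the number of spots.
            # Assumes len(ranked) <= n_spots, as in the sample data.
            output[day][position] = ranked + [''] * (n_spots - len(ranked))
    return output
```

Called as sort_names_by_availability(spots, names) on the sample, this keeps Billy first and Claire second on every day they appear, and every list is padded to the spot count of its day. Names are packed to the front rather than tied to a fixed slot, which is what puts Anna third (not fourth) on day 4.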
