Jannik
Jannik

Reputation: 95

Sorting People into Groups based on Votes

I have a problem with finding a algorithm for sorting a dataset of people. I try to explain as detailed as possible:

The story starts with a survey. A bunch of people, lets say 600 can choose between 20-25 projects. They make a #1-wish, #2-wish and #3-wish, where #1 is the most wanted project they want to take part and wish 3 the "not-perfect-but-most-acceptable-choose".

These project are limited in their number of participants. Every project can join around 30 people (based on the number of people and count of projects).

The algorithm puts the people in the different projects and should find the best possible combination.

The problem is that you can't just put all the people with their number 1 wish X in the certain project and stuff all the other with also number 1 wish X in there number 2 wish because that would not be the most "happiest" situation for everybody.

You may can think of what I mean when you imagine that for everybody who get his number 1 wish you get 100 points, for everybody who get his number 2 wish 60 points, number 3 wish 30 points and who get not in one of his wishes 0 points. And you want to get as most points as possible.

I hope you get my problem. This is for a school-project day. Is there something out there that could help me? Do you have any idea? I would be thankful for every tipp!!

Kind regards

Upvotes: 3

Views: 1792

Answers (2)

Peter de Rivaz
Peter de Rivaz

Reputation: 33509

You can solve this optimally by formulating it as a min cost network flow problem.

Add a node for each person, and one for each project.

Set cost for a flow between a person and a project according to their preferences.

(As Networkx provides a min cost flow, but not max cost flow I have set the costs to be negative.)

For example, using Networkx and Python:

import networkx as nx

G=nx.DiGraph()

prefs={'Tom':['Project1','Project2','Project3'],
       'Dick':['Project2','Project1','Project3'],
       'Harry':['Project1','Project3','Project1']}

capacities={'Project1':2,'Project2':10,'Project3':4}

num_persons=len(prefs)
G.add_node('dest',demand=num_persons)
A=[]
for person,projectlist in prefs.items():
    G.add_node(person,demand=-1)
    for i,project in enumerate(projectlist):
        if i==0:
            cost=-100 # happy to assign first choices
        elif i==1:
            cost=-60 # slightly unhappy to assign second choices
        else:
            cost=-30 # very unhappy to assign third choices
        G.add_edge(person,project,capacity=1,weight=cost) # Edge taken if person does this project

for project,c in capacities.items():
        G.add_edge(project,'dest',capacity=c,weight=0)

flowdict = nx.min_cost_flow(G)
for person in prefs:
    for project,flow in flowdict[person].items():
        if flow:
            print person,'joins',project

In this code Tom's number 1 choice is Project1, followed by Project2, then Project3.

The capacities dictionary specifies the upper limit on how many people can join each project.

Upvotes: 5

Anachronist
Anachronist

Reputation: 1102

My algorithm would be something like this:

mainloop
 wishlevel = 1
  loop
   Distribute people into all projects according to wishlevel wish
   loop through projects, counting population
    If population exceeds maximum
     Distribute excess non-redistributed people into their wishlevel+1 projects that are under-populated
     tag distributed people as 'redistributed' to avoid moving again
    endif
   endloop
  wishlevel = wishlevel + 1
 loop until wishlevel == 3    
mainloop until no project exceeds max population

This should make several passes through the data set until everything is evened out. This algorithm may result in an endless loop if you restrict redistribution of already-redistributed people in the event that one project fills up with such people as the algorithm progresses, so you might try it without that restriction.

Upvotes: 0

Related Questions