Reputation: 5167
I have a group of people, and I want each person to have a 1:1 meeting with every other person in the group. A given person can only meet with one other person at a time, so I want to do the following:
To demonstrate the problem in terms of desired input/output, let's say I have the following list:
>>> people = ['Dave', 'Mary', 'Susan', 'John']
I want to produce the following output:
>>> for round in make_rounds(people):
>>> print(round)
[('Dave', 'Mary'), ('Susan', 'John')]
[('Dave', 'Susan'), ('Mary', 'John')]
[('Dave', 'John'), ('Mary', 'Susan')]
If I had an odd number of people, then I would expect this result:
>>> people = ['Dave', 'Mary', 'Susan']
>>> for round in make_rounds(people):
>>> print(round)
[('Dave', 'Mary')]
[('Dave', 'Susan')]
[('Mary', 'Susan')]
The key to this problem is that I need my solution to be performant (within reason). I've written code that works, but as the size of people
grows it becomes exponentially slow. I don't know enough about writing performant algorithms to know whether my code is inefficient, or whether I'm simply bound by the parameters of the problem
Step 1 is easy: I can get all possible pairings using itertools.combinations
:
>>> from itertools import combinations
>>> people_pairs = set(combinations(people, 2))
>>> print(people_pairs)
{('Dave', 'Mary'), ('Dave', 'Susan'), ('Dave', 'John'), ('Mary', 'Susan'), ('Mary', 'John'), ('Susan', 'John')}
To work out the rounds themselves, I'm building a round like so:
round
listpeople_pairs
set calculated using the combinations
method aboveround
that already contain that individualpeople_pairs
list.rounds
listpeople_pairs
now contains only the pairs that didn't make it into the first roundEventually this produces the desired result, and whittles down my people pairs until there are none left and all the rounds are calculated. I can already see that this is requiring a ridiculous number of iterations, but I don't know a better way of doing this.
Here's my code:
from itertools import combinations
# test if person already exists in any pairing inside a round of pairs
def person_in_round(person, round):
is_in_round = any(person in pair for pair in round)
return is_in_round
def make_rounds(people):
people_pairs = set(combinations(people, 2))
# we will remove pairings from people_pairs whilst we build rounds, so loop as long as people_pairs is not empty
while people_pairs:
round = []
# make a copy of the current state of people_pairs to iterate over safely
for pair in set(people_pairs):
if not person_in_round(pair[0], round) and not person_in_round(pair[1], round):
round.append(pair)
people_pairs.remove(pair)
yield round
Plotting out the performance of this method for list sizes of 100-300 using https://mycurvefit.com shows that calculating rounds for a list of 1000 people would probably take around 100 minutes. Is there a more efficient way of doing this?
Note: I'm not actually trying to organise a meeting of 1000 people :) this is just a simple example that represents the matching / combinatorics problem I'm trying to solve.
Upvotes: 25
Views: 2379
Reputation: 1
Don’t make a copy of the set every time through the list. That’s a big waste of time/memory. Instead, modify the set once after each iteration.
Keep a separate set of people in each round. Looking up a person in a set will be an order of magnitude faster than looping through the entire round.
Upvotes: 0
Reputation: 45542
This is an implementation of the algorithm described in the Wikipedia article Round-robin tournament.
from itertools import cycle , islice, chain
def round_robin(iterable):
items = list(iterable)
if len(items) % 2 != 0:
items.append(None)
fixed = items[:1]
cyclers = cycle(items[1:])
rounds = len(items) - 1
npairs = len(items) // 2
return [
list(zip(
chain(fixed, islice(cyclers, npairs-1)),
reversed(list(islice(cyclers, npairs)))
))
for _ in range(rounds)
for _ in [next(cyclers)]
]
Upvotes: 17
Reputation: 1474
Two things you can do right off the bat:
Don’t make a copy of the set every time through the list. That’s a big waste of time/memory. Instead, modify the set once after each iteration.
Keep a separate set of people in each round. Looking up a person in a set will be an order of magnitude faster than looping through the entire round.
Ex:
def make_rounds(people):
people_pairs = set(combinations(people, 2))
while people_pairs:
round = set()
people_covered = set()
for pair in people_pairs:
if pair[0] not in people_covered \
and pair[1] not in people_covered:
round.add(pair)
people_covered.update(pair)
people_pairs -= round # remove thi
yield round
Upvotes: 3
Reputation: 23508
I generate just the indices (because I have trouble coming up with 1000 names =), but for 1000 numbers the runtime is about 4 seconds.
The main problem of all other approaches -- they use pairs and work with them, there are plenty of pairs, and the run time is getting much longer. My approach differs in working with people, not pairs. I have a dict()
that maps the person to the list of other persons (s)he has to meet, and these lists are at most N items long (not N^2, as with pairs). Hence the time savings.
#!/usr/bin/env python
from itertools import combinations
from collections import defaultdict
pairs = combinations( range(6), 2 )
pdict = defaultdict(list)
for p in pairs :
pdict[p[0]].append( p[1] )
while len(pdict) :
busy = set()
print '-----'
for p0 in pdict :
if p0 in busy : continue
for p1 in pdict[p0] :
if p1 in busy : continue
pdict[p0].remove( p1 )
busy.add(p0)
busy.add(p1)
print (p0, p1)
break
# remove empty entries
pdict = { k : v for k,v in pdict.items() if len(v) > 0 }
'''
output:
-----
(0, 1)
(2, 3)
(4, 5)
-----
(0, 2)
(1, 3)
-----
(0, 3)
(1, 2)
-----
(0, 4)
(1, 5)
-----
(0, 5)
(1, 4)
-----
(2, 4)
(3, 5)
-----
(2, 5)
(3, 4)
'''
Upvotes: 6
Reputation: 4069
Maybe I'm missing something (not entirely uncommon) but this sounds like a plain old round-robin tournament, where each team plays every other team exactly once.
There are O(n^2) methods to handle this "by hand", that work "by machine" just fine. One good description can be found in the Wikipedia article on Round-Robin Tournaments.
About that O(n^2): There will be either n-1 or n rounds, each requiring O(n) steps to rotate all but one table entry and O(n) steps to enumerate the n//2
matches in each round. You can make the rotation O(1) by using doubly-linked lists, but the enumeration of the matches is still O(n). So O(n)*O(n) = O(n^2).
Upvotes: 2
Reputation: 3491
When you need fast lookups hashes/dicts are the way to go. Keep track of whose been in each round in a dict
rather than a list
and it will be much faster.
Since you are getting in algorithms, studying big O notation will help you out and knowing what data structures are good at what kind of operations is also key. See this guide: https://wiki.python.org/moin/TimeComplexity for time complexity of Python built-ins. You'll see that checking for an item in a list is O(n) meaning that it scales linearly with the size of your inputs. So since it is in a loop you end up with O(n^2) or worse. For dicts, lookups are generally O(1) meaning that it doesn't matter the size of your input.
Also, don't override built-ins. I'd changed round
to round_
from itertools import combinations
# test if person already exists in any pairing inside a round of pairs
def person_in_round(person, people_dict):
return people_dict.get(person, False)
def make_rounds(people):
people_pairs = set(combinations(people, 2))
people_in_round = {}
# we will remove pairings from people_pairs whilst we build rounds, so loop as long as people_pairs is not empty
while people_pairs:
round_ = []
people_dict = {}
# make a copy of the current state of people_pairs to iterate over safely
for pair in set(people_pairs):
if not person_in_round(pair[0], people_dict) and not person_in_round(pair[1], people_dict):
round_.append(pair)
people_dict[pair[0]] = True
people_dict[pair[1]] = True
people_pairs.remove(pair)
yield round_
Upvotes: 3
Reputation: 31
This takes about 45s on my computer
def make_rnds(people):
people_pairs = set(combinations(people, 2))
# we will remove pairings from people_pairs whilst we build rnds, so loop as long as people_pairs is not empty
while people_pairs:
rnd = []
rnd_set = set()
peeps = set(people)
# make a copy of the current state of people_pairs to iterate over safely
for pair in set(people_pairs):
if pair[0] not in rnd_set and pair[1] not in rnd_set:
rnd_set.update(pair)
rnd.append(pair)
peeps.remove(pair[0])
peeps.remove(pair[1])
people_pairs.remove(pair)
if not peeps:
break
yield rnd
I dropped the function person_in_rnd
to reduce time lost to function calls, and added a variable called rnd_set and peeps. rnd_set is a set of everyone in the round so far and is used to check for matches against the pair. peeps is a copied set of people, every time we add a pair to rnd we remove that those individuals from peeps. This lets us stop iterating through all combinations once peeps is empty, that is, once everyone is in a round.
Upvotes: 1