lxyu
lxyu

Reputation: 2839

Algorithm - how to fast make sure elements unique from each group

Don't know how to explain the problem in a title.

Here's the problem:

Assume we have 4 group:

(a, b, c, d)
(e, f)
(g, h, i)
(j, k, l, m, n)

Now I'm given a 4 elements tuple, for example (a, e, h, m), neither 2 comes from one group, so I return True. If given (a, b, e, g), then a, b are from one group, return False.

Then here's my current idea, I give each element an id start with group number and test for duplicate.

g1 = ['1a', '1b', '1c', '1d']
g2 = ['2e', '2f']
g3 = ['3g', '3h', '3i']
g4 = ['4j', '4k', '4l', '4m', '4n']


def test(elements):
    if len(elements) != 4:
        return False

    stack = []
    for e in elements:
        mark = e[:1]
        if mark in stack:
            return False
        stack.append(mark)

    ga = set(g1 + g2 + g3 + g4)
    return set(elements).issubset(ga)


print test(('1a', '1b', '2e', '3g'))
print test(('1a', '2e', '3g', '4m'))

But I think the string compare is not a very elegant solution, can this be done by another faster algorithm?

Upvotes: 0

Views: 153

Answers (2)

monkut
monkut

Reputation: 43840

Just playing around in the interpreter, same as mgilson, but you don't need to check len(), an existing set() evaluates to True.

>>> g1 = ['1a', '1b', '1c', '1d']
>>> g2 = ['2e', '2f']
>>> g3 = ['3g', '3h', '3i']
>>> g4 = ['4j', '4k', '4l', '4m', '4n']
>>>
>>> groups = (g1, g2, g3, g4)
>>> t1 = ('1a', '1b', '2e', '3g')
>>> t2 = ('1a', '2e', '3g', '4m')
>>>
>>> all(set(t2).intersection(g) for g in groups)
True
>>> all(set(t1).intersection(g) for g in groups)
False
>>>

Upvotes: 2

mgilson
mgilson

Reputation: 309929

If all the elements are hashable, I would use a set.intersection:

g1 = set(['1a', '1b', '1c', '1d'])
g2 = set(['2e', '2f'])
g3 = set(['3g', '3h', '3i'])
g4 = set(['4j', '4k', '4l', '4m', '4n'])

sets = [g1,g2,g3,g4]
test_this = ['1a','2e','3g','4j']

all(len(s.intersection(test_this)) <= 1 for s in sets)

Alternatively, if you don't want to change the type of g1,g2 ... you can change the type of test_this:

g1 = ['1a', '1b', '1c', '1d']
g2 = ['2e', '2f']
g3 = ['3g', '3h', '3i']
g4 = ['4j', '4k', '4l', '4m', '4n']
lists = [g1,g2,g3,g4]
test_this = set(['1a','2e','3g','4j'])

all( len(test_this.intersection(lst)) <= 1 for lst in lists )

The beauty here is that all is smart enough to short-circuit -- and since we're using a generator expression, we don't need to calculate all the intersections up front. Python will only keep calculating the intersections as long as all the previous intersections had length less than or equal to 1.

Upvotes: 4

Related Questions