Reputation: 867

How can I optimize these nested loops?

Can anyone optimize this code block? It is working but running very slow.

maxsat = 0
possiblevotes = []
for i in range(1,int(numcats)+1):
    for j in range(1,int(numdogs)+1):
        possiblevotes.append('C' + str(i) + ' ' + 'D' + str(j))
        possiblevotes.append('D' + str(j) + ' ' + 'C' + str(i))
for m in possiblevotes:
    count = 0
    for n in votes:
        if m == n:
            count += 1
        elif m.split()[0] == n.split()[0]:
            count += 1
    if count > maxsat:
        maxsat = count

Upvotes: 1

Answers (4)

gukoff

Reputation: 2250

import re

vote_pattern = re.compile('^(C|D)\d+\s')

votes = ['123', 'A1123', 'cC32', 'C', 'D0', 'C11']
maxsat = sum(0 if vote_pattern.match(vote) is None else 1 for vote in votes)

You can change this awful sum to something like filter, of course.

Upvotes: 0

Martijn Pieters

Reputation: 1125078

There is no need to generate all possible votes. You can test your actual votes without having to generate the possiblevotes list, because you can easily calculate if an existing vote is possible or not.

You also only really count the 'stay' votes. It doesn't matter that you look for matching 'stay go' votes, because any 'stay go' vote for which m == n is true, m.split()[0] == n.split()[0] is also true. So you may as well drop that first count, and only look at the second.

Now you are just finding the maximum count for the stay votes. Using a collections.Counter() makes counting things easier:

import collections

vote_counts = collections.Counter(v.split()[0] for v in votes)

maxsat = vote_counts.most_common(1)[0][1]  # retrieve the most popular count

This calculates the same number your code calculated, but now we only have to loop over the votes just once, and only count 'stay' votes.

Contrast this with your loop, where you first loop numcats * numdogs times, then loop numcats * numdogs * 2 * len(votes) times. That a factor of 3 * numcats * numdogs larger.

If you have to validate votes first, you can use:

from itertools import ifilter

numcats = int(numcats)
numdogs = int(numdogs)

def validvote(vote):
    stay, go = vote.split()
    cat, dog = sorted((stay, go))
    if (cat[0], dog[0]) != ('C', 'D'):
        return False
    if not (1 >= int(cat[1:]) >= numcats):
        return False
    if not (1 >= int(dog[1:]) >= numdogs):
        return False
    return True

vote_counts = collections.Counter(v.split()[0] for v in ifilter(validvote, votes))

You could also start using the go votes:

stay_votes = collections.Counter()
go_votes = collections.Counter()

for vote in ifilter(validvote, votes):
    stay, go = vote.split()
    stay_votes[stay] += 1
    go_votes[go] += 1

Now you can simply subtract the go votes from the stay vote tally (any tally falling to 0 is removed):

total_votes = stay_votes - go_votes

# Display top 10
for creature, tally in total_votes.most_common(10):
    print('{}: {:>#5d}'.format(creature, tally))

Of course, you could also do the calculation in one go:

total_votes = collections.Counter()

for vote in ifilter(validvote, votes):
    stay, go = vote.split()
    total_votes[stay] += 1
    total_votes[go] -= 1

but keeping the vote tallies separate might be interesting for later analysis.

Upvotes: 1

Hai Vu

Reputation: 40783

The code takes a long time because of the nested loops. If you have 1000 cats, 1000 dogs, and 1000 votes, then the first set of loops runs 1000x1000 times; the second set runs 1000x1000x1000 times. If we can remove nested loops, then your code will run faster.

I noticed you seems to tally the votes, in which 'C1 D3' is the same as 'D3 C1'. I suggest you use the Counter class in the collections module to do the heavy lifting. Here is my solution:

import collections

if __name__ == '__main__':
    votes = ['C1 D3', 'D1 C5', 'D3 C1', 'd1 c1', 'c1 d3'] # Example votes

    # Normalize the votes: 'D3 C1' becomes 'C1 D3',
    # 'c1 d3' becomes 'C1 D3'
    normalized_votes = [''.join(sorted(v.upper().split())) for v in votes]

    # Count the votes
    counter = collections.Counter(normalized_votes)

    # Top 10
    print '--- TOP 10 ---'
    for vote, count in counter.most_common(10):
        print count, vote

    # Or print all
    print '--- ALL ---'
    for vote, count in counter.iteritems():
        print count, vote

Discussion

This solution uses 4 loops: the first is to derive normalized_votes, the second is to create the counter variable. The last two loops deal with printing out the result. None of these loops are nested. One might argue that the implementation of the Counter class might contain nested loops, but I trust that this class is implemented as efficient as possible.
One important step is to normalize the votes, which should greatly simplify your tally. While I have done it in one line, you can break it up into several steps to aid understanding.
1. The first step is to convert all the votes to upper case: 'd3 c1' becomes 'D3 C1'.
2. Next, I break them up into a list with the split() function: 'D3 C1' becomes ['D3', 'C1'].
3. The third step is to sort each item: ['D3', 'C1'] becomes ['C1', 'D3']
4. The final step glue them back together: ['C1', 'D3'] becomes 'C1 D3'

Upvotes: 0

f p

Reputation: 3223

Use a dictionary instead of a list:

possiblevotes = {}
for i in range(1,int(numcats)+1):
    for j in range(1,int(numdogs)+1):
        possiblevotes['C' + str(i) + ' ' + 'D' + str(j)] = 0
        possiblevotes['D' + str(j) + ' ' + 'C' + str(i)] = 0
for n in votes:
    possiblevotes[n] += 1
....

Upvotes: 0

How can I optimize these nested loops?

Answers (4)

Discussion

Related Questions