Reputation: 867
Can anyone optimize this code block? It is working but running very slow.
maxsat = 0
possiblevotes = []
for i in range(1,int(numcats)+1):
for j in range(1,int(numdogs)+1):
possiblevotes.append('C' + str(i) + ' ' + 'D' + str(j))
possiblevotes.append('D' + str(j) + ' ' + 'C' + str(i))
for m in possiblevotes:
count = 0
for n in votes:
if m == n:
count += 1
elif m.split()[0] == n.split()[0]:
count += 1
if count > maxsat:
maxsat = count
Upvotes: 1
Views: 387
Reputation: 2250
import re
vote_pattern = re.compile('^(C|D)\d+\s')
votes = ['123', 'A1123', 'cC32', 'C', 'D0', 'C11']
maxsat = sum(0 if vote_pattern.match(vote) is None else 1 for vote in votes)
You can change this awful sum to something like filter, of course.
Upvotes: 0
Reputation: 1125078
There is no need to generate all possible votes. You can test your actual votes without having to generate the possiblevotes
list, because you can easily calculate if an existing vote is possible or not.
You also only really count the 'stay' votes. It doesn't matter that you look for matching 'stay go' votes, because any 'stay go' vote for which m == n
is true, m.split()[0] == n.split()[0]
is also true. So you may as well drop that first count, and only look at the second.
Now you are just finding the maximum count for the stay
votes. Using a collections.Counter()
makes counting things easier:
import collections
vote_counts = collections.Counter(v.split()[0] for v in votes)
maxsat = vote_counts.most_common(1)[0][1] # retrieve the most popular count
This calculates the same number your code calculated, but now we only have to loop over the votes just once, and only count 'stay' votes.
Contrast this with your loop, where you first loop numcats * numdogs
times, then loop numcats * numdogs * 2 * len(votes)
times. That a factor of 3 * numcats * numdogs
larger.
If you have to validate votes first, you can use:
from itertools import ifilter
numcats = int(numcats)
numdogs = int(numdogs)
def validvote(vote):
stay, go = vote.split()
cat, dog = sorted((stay, go))
if (cat[0], dog[0]) != ('C', 'D'):
return False
if not (1 >= int(cat[1:]) >= numcats):
return False
if not (1 >= int(dog[1:]) >= numdogs):
return False
return True
vote_counts = collections.Counter(v.split()[0] for v in ifilter(validvote, votes))
You could also start using the go votes:
stay_votes = collections.Counter()
go_votes = collections.Counter()
for vote in ifilter(validvote, votes):
stay, go = vote.split()
stay_votes[stay] += 1
go_votes[go] += 1
Now you can simply subtract the go votes from the stay vote tally (any tally falling to 0 is removed):
total_votes = stay_votes - go_votes
# Display top 10
for creature, tally in total_votes.most_common(10):
print('{}: {:>#5d}'.format(creature, tally))
Of course, you could also do the calculation in one go:
total_votes = collections.Counter()
for vote in ifilter(validvote, votes):
stay, go = vote.split()
total_votes[stay] += 1
total_votes[go] -= 1
but keeping the vote tallies separate might be interesting for later analysis.
Upvotes: 1
Reputation: 40783
The code takes a long time because of the nested loops. If you have 1000 cats, 1000 dogs, and 1000 votes, then the first set of loops runs 1000x1000 times; the second set runs 1000x1000x1000 times. If we can remove nested loops, then your code will run faster.
I noticed you seems to tally the votes, in which 'C1 D3' is the same as 'D3 C1'. I suggest you use the Counter class in the collections module to do the heavy lifting. Here is my solution:
import collections
if __name__ == '__main__':
votes = ['C1 D3', 'D1 C5', 'D3 C1', 'd1 c1', 'c1 d3'] # Example votes
# Normalize the votes: 'D3 C1' becomes 'C1 D3',
# 'c1 d3' becomes 'C1 D3'
normalized_votes = [''.join(sorted(v.upper().split())) for v in votes]
# Count the votes
counter = collections.Counter(normalized_votes)
# Top 10
print '--- TOP 10 ---'
for vote, count in counter.most_common(10):
print count, vote
# Or print all
print '--- ALL ---'
for vote, count in counter.iteritems():
print count, vote
This solution uses 4 loops: the first is to derive normalized_votes, the second is to create the counter variable. The last two loops deal with printing out the result. None of these loops are nested. One might argue that the implementation of the Counter
class might contain nested loops, but I trust that this class is implemented as efficient as possible.
One important step is to normalize the votes, which should greatly simplify your tally. While I have done it in one line, you can break it up into several steps to aid understanding.
split()
function: 'D3 C1' becomes ['D3', 'C1']. Upvotes: 0
Reputation: 3223
Use a dictionary instead of a list:
possiblevotes = {}
for i in range(1,int(numcats)+1):
for j in range(1,int(numdogs)+1):
possiblevotes['C' + str(i) + ' ' + 'D' + str(j)] = 0
possiblevotes['D' + str(j) + ' ' + 'C' + str(i)] = 0
for n in votes:
possiblevotes[n] += 1
....
Upvotes: 0