Double counting loop (n*n code performance)

Question

In Python, I have a table like this, in the form of a list of lists

A: 8
B: 6
C: 8
D: 3
E: 4
F: 5
G: 7

I am trying to get, for each line, the number of "neighbours" e.g. the number of lines for which the number is either the same number, -1 or +1. Here it would be:

A: 3: 8
B: 3: 6
C: 3: 8
D: 2: 3
E: 3: 5
F: 3: 5
G: 4: 7

I have a code that works, consisting of a double loop: for each line, looping through all the lines and counting all the lines that comply

This works but for thousands of lines, I end up with an n*n performance problem. Any general way of improving this?

Jim Mischel · Accepted Answer

So the number of neighbors that a line with "8" has is the sum of:

Number of 7's
Number of 8's
Number of 9's

The simple solution is:

Create a dictionary that's indexed by count, with the value being the number of lines with that count.

Go through your table. For each line, add one to the corresponding dictionary count value. So, when you see the "A" line, you add 1 to the count for key 8. When you see the "B" line, you add 1 to the count for key 6, etc. When you're done, the count for key 8 will be 2. Key 6 will have count 1, etc.

Go through your table again. For each item, output the key, its count, and the sums from the three dictionary values: dictionary[count] + dictionary[count-1] + dictionary[count+1].

The algorithm is O(n), with O(n) extra space (for the dictionary).

Double counting loop (n*n code performance)

Answers (1)

Related Questions