Anders
Anders

Reputation: 115

Guidance on removing a nested for loop from function

I'm trying to write the fastest algorithm possible to return the number of "magic triples" (i.e. x, y, z where z is a multiple of y and y is a multiple of x) in a list of 3-2000 integers.

(Note: I believe the list was expected to be sorted and unique but one of the test examples given was [1,1,1] with the expected result of 1 - that is a mistake in the challenge itself though because the definition of a magic triple was explicitly noted as x < y < z, which [1,1,1] isn't. In any case, I was trying to optimise an algorithm for sorted lists of unique integers.)

I haven't been able to work out a solution that doesn't include having three consecutive loops and therefore being O(n^3). I've seen one online that is O(n^2) but I can't get my head around what it's doing, so it doesn't feel right to submit it.

My code is:

def solution(l):
    if len(l) < 3:
        return 0
    elif l == [1,1,1]:
        return 1
    else:
        halfway = int(l[-1]/2)
        quarterway = int(halfway/2)
        quarterIndex = 0
        halfIndex = 0
        for i in range(len(l)):
          if l[i] >= quarterway:
            quarterIndex = i
            break
        for i in range(len(l)):
          if l[i] >= halfway:
            halfIndex = i
            break
        triples = 0
        for i in l[:quarterIndex+1]:
            for j in l[:halfIndex+1]:
                if j != i and j % i == 0:
                    multiple = 2
                    while (j * multiple) <= l[-1]:
                        if j * multiple in l:
                            triples += 1
                        multiple += 1
    return triples  

I've spent quite a lot of time going through examples manually and removing loops through unnecessary sections of the lists but this still completes a list of 2,000 integers in about a second where the O(n^2) solution I found completes the same list in 0.6 seconds - it seems like such a small difference but obviously it means mine takes 60% longer.

Am I missing a really obvious way of removing one of the loops?

Also, I saw mention of making a directed graph and I see the promise in that. I can make the list of first nodes from the original list with a built-in function, so in principle I presume that means I can make the overall graph with two for loops and then return the length of the third node list, but I hit a wall with that too. I just can't seem to make progress without that third loop!!

Upvotes: 0

Views: 267

Answers (4)

Anders
Anders

Reputation: 115

This answer was the first suggestion by @alaniwi and is the one I've found to be the fastest (at 0.59 seconds for a 2,000 integer list).

def solution(l):
    n = len(l)
    lower_counts = dict((val, 0) for val in l)
    upper_counts = lower_counts.copy()
    for i in range(n - 1):
        lower = l[i]
        for j in range(i + 1, n):
            upper = l[j]
            if upper % lower == 0:
                lower_counts[lower] += 1
                upper_counts[upper] += 1        

    return sum((lower_counts[y] * upper_counts[y] for y in l))

I think I've managed to get my head around it. What it is essentially doing is comparing each number in the list with every other number to see if the smaller is divisible by the larger and makes two dictionaries:

  • One with the number of times a number is divisible by a larger number,
  • One with the number of times it has a smaller number divisible by it.

You compare the two dictionaries and multiply the values for each key because the key having a 0 in either essentially means it is not the second number in a triple.

Example:

l = [1,2,3,4,5,6]
lower_counts = {1:5, 2:2, 3:1, 4:0, 5:0, 6:0}
upper_counts = {1:0, 2:1, 3:1, 4:2, 5:1, 6:3}
triple_tuple = ([1,2,4], [1,2,6], [1,3,6])

Upvotes: 0

Fanchen Bao
Fanchen Bao

Reputation: 4289

@alaniwi provided a very smart iterative solution.

Here is a recursive solution.

def find_magicals(lst, nplet):
    """Find the number of magical n-plets in a given lst"""
    res = 0
    for i, base in enumerate(lst):
        # find all the multiples of current base
        multiples = [num for num in lst[i + 1:] if not num % base]
        res += len(multiples) if nplet <= 2 else find_magicals(multiples, nplet - 1)
    return res


def solution(lst):
    return find_magicals(lst, 3)

The problem can be divided into selecting any number in the original list as the base (i.e x), how many du-plets we can find among the numbers bigger than the base. Since the method to find all du-plets is the same as finding tri-plets, we can solve the problem recursively.

From my testing, this recursive solution is comparable to, if not more performant than, the iterative solution.

Upvotes: 0

LevB
LevB

Reputation: 953

How about using itertools.combinations instead of nested for loops? Combined with list comprehension, it's cleaner and much faster. Let's say l = [your list of integers] and let's assume it's already sorted.

from itertools import combinations

def div(i,j,k): # this function has the logic
    return l[k]%l[j]==l[j]%l[i]==0

r = sum([div(i,j,k) for i,j,k in combinations(range(len(l)),3) if i<j<k])

Upvotes: 0

alani
alani

Reputation: 13079

from array import array

def num_triples(l):
    n = len(l)
    pairs = set()
    lower_counts = array("I", (0 for _ in range(n)))
    upper_counts = lower_counts[:]
    for i in range(n - 1):
        lower = l[i]
        for j in range(i + 1, n):
            upper = l[j]
            if upper % lower == 0:
                lower_counts[i] += 1
                upper_counts[j] += 1        
    return sum(nx * nz for nz, nx in zip(lower_counts, upper_counts))

Here, lower_counts[i] is the number of pairs of which the ith number is the y, and z is the other number in the pair (i.e. the number of different z values for this y).

Similarly, upper_counts[i] is the number of pairs of which the ith number is the y, and x is the other number in the pair (i.e. the number of different x values for this y).

So the number of triples in which the ith number is the y value is just the product of those two numbers.

The use of an array here for storing the counts is for scalability of access time. Tests show that up to n=2000 it makes negligible difference in practice, and even up to n=20000 it only made about a 1% difference to the run time (compared to using a list), but it could in principle be the fastest growing term for very large n.

Upvotes: 4

Related Questions