Jack
Jack

Reputation: 71

Permutations of a list where a condition is met?

Say I have a list of images ['1.jpg', '2.jpg', '3.jpg'...] and a corresponding list of vertical sizes [200, 400, 300...] I am trying to return all permutations of the list of images where the size lies within 10% of neighbouring elements. This is proving to be quite challenging!

I firstly tried itertools.permutations on the entire list and then iterated over each element and checked for validity. This worked, but when n > 12 it obviously becomes quite slow, and it seems inefficient to generate so many invalid permutations from the outset.

I then realised that the order is not especially important, as the images will be shown in a loop, so I fixed the first element in the list, but this still required me to permute every element, so again, inefficient.

I then went searching for another approach, and decided to try this:

images = ['1.jpg', '2.jpg', '3.jpg', '4.jpg', '5.jpg']
v_size = [100, 125, 110, 120, 95]
pairs = list(itertools.permutations(images,2))

This yielded a list of all possible pairings of images, which I could then validate and filter down to only pairs which met the +/- 10% criteria, so I was ultimately left with the following set of valid pairings meeting my criteria:

[('1.jpg', '3.jpg'), ('1.jpg', '5.jpg'), ('2.jpg', '4.jpg'), ('3.jpg', '1.jpg'), ('3.jpg', '4.jpg'), ('4.jpg', '2.jpg'), ('4.jpg', '3.jpg'), ('5.jpg', '1.jpg')]

Inspecting it, it seems to make sense. So images 1 and 3 can sit next to each other, and so can 3 and 4, 4 and 2 etc. So what I am looking to generate is a recombination of these to find all (if any) valid permutations of the original images. So for example:

['5.jpg', '1.jpg', '3.jpg', '4.jpg', '2.jpg']

Would be a valid arrangement, whereas:

['1.jpg', '2.jpg', '3.jpg', '4.jpg', '5.jpg']

Would not, as the images sizes lie outside of 10% size constraint of each other.

I have considered recursion, but I am quite new to this and am not sure where to start.

Would appreciate any advice on how to solve this problem, as I have been trying for days!

Upvotes: 7

Views: 1071

Answers (2)

lenik
lenik

Reputation: 23508

The solution is quite simple. Firstly, you sort your vertical sizes, so it becomes easier to understand the principle (it works without the sorting all the same, but harder to debug or figure out what's wrong).

Once sorted, you build the matrix, that contains 1 for every possible transition between the pictures, and 0 otherwise.

Then, it's simple exhaustive recursive search for the possible jumps between the rows:

import numpy as np

v_size = [95, 100, 102, 110, 120, 125]

ss = sorted(v_size)
num = len(v_size)

sq = np.zeros([num,num]).astype(int)

for i in range(num) :
    for j in range(num) :
        if i == j : continue

        if abs(ss[i]-ss[j]) < (ss[i]+ss[j])/20.0 :  # 10%
            sq[i,j] = sq[j,i] = 1

print sq

current = []

def search( row ) :
    if len(current) == num :
        print current
        return
    for i in range(num) :
        if sq[row,i] > 0 and i not in current :
            current.append(i)
            search(i)
            current.pop()
    return

for i in range(num) :
    current.append(i)
    search(i)
    current.pop()

prints out this, connectivity matrix first and possible sequences later.

'''
[[0 1 1 0 0 0]
 [1 0 1 1 0 0]
 [1 1 0 1 0 0]
 [0 1 1 0 1 0]
 [0 0 0 1 0 1]
 [0 0 0 0 1 0]]

[0, 1, 2, 3, 4, 5]
[0, 2, 1, 3, 4, 5]
[1, 0, 2, 3, 4, 5]
[2, 0, 1, 3, 4, 5]
[5, 4, 3, 1, 0, 2]
[5, 4, 3, 1, 2, 0]
[5, 4, 3, 2, 0, 1]
[5, 4, 3, 2, 1, 0]
'''

8 results for 10% tolerance, 96 results for 20% tolerance (not printed here, because it's about 100 lines long =).

Upvotes: 0

Prune
Prune

Reputation: 77837

This is a graph problem. Each node has a bidirectional edge to any node whose size is within 10%. Finding all permutations with your restrictions is isomorphic to finding all Hamiltonian paths in this graph.

First, you need to build the graph; use your favorite graph package or simpler representation. This pre-processing is necessarily a O(n^2), as the quantity of edges may be n(n-1)/2. However, in practical cases, you can sort the list by size first, which should give you an effective O(n log n) sort and O(n) graph building, as there will be only a handful of connections from each node to its neighbors. For instance, in your given mini-example, I'll add another file, so we have:

# Before sorting
images = ['1.jpg', '2.jpg', '3.jpg', '4.jpg', '5.jpg', '6.jpg']
v_size = [100, 125, 110, 120, 95, 102]

# After sorting
images = ['5.jpg', '1.jpg', '6.jpg', '3.jpg', '4.jpg', '2.jpg']
v_size = [95, 100, 102, 110, 120, 125]

From here, keep an upper-bound marker, mark

  • Start at 95; make the edge to 100. Set mark = 1 (index of 100)
  • Move to 100; make the edges to 102 & 110. mark = 3
  • Move to 102; make the edge to 110 without checking; we already know it's within range, because of mark. Check against 120, which is too far.
  • Move to 110; make the edge to 120 ... you get the idea from here.

Then you get to look up the algorithms available for an exhaustive traversal of the graph, i.e. a Hamiltonian path.

Upvotes: 3

Related Questions