Michael
Michael

Reputation: 89

List comparison of element

I have a question and it is a bit hard for me to explain so I will be using lots of examples to help you all understand and see if you could help me.

Say I have two lists containing book names from best to worst rated by two people. User1 rated lstA, and user2 rated lstB

lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']

User one thinks 'Harry Potter' is better than 'Dracula' (HP is index 0, and Dracula is index 3)

User two thinks 'Harry Potter' is worse than Dracula, (HP is index 3 and Dracula is index 1)

In this case, return a tuple ('Harry Potter', 'Dracula') [('Dracula', 'Harry Potter') is also fine]

User one also rated '50 shades' better than 'Dracula' and user two also rated '50 shades' better than 'Dracula' (index 2, 3 and 0, 1 respectively). In this case, nothing happens.

The final result of the program should return a list of tuples so,

[('Harry Potter','50 Shades'), ('Harry Potter','Dracula'), ('Harry Potter','1984'), ('1984', '50 Shades'), ('1984','Dracula')]

Could someone help me to point me in the right direction to come up with an algorithm that gives all the tuples?

Upvotes: 6

Views: 268

Answers (4)

Mad Physicist
Mad Physicist

Reputation: 114230

One way to do this would be to accumulate all the positive orderings form each list into a set, then take the difference of the two sets. The positive ordering would be (a, b) when the a precedes b in its respective list. This is the ordering guaranteed by itertools.combinations:

from itertools import combinations

setA = set(combinations(lstA, 2))
setB = set(combinations(lstB, 2))

result = setA - setB

This would simply discard any orderings that the two sets agree on. If both lists had the same books, this would be almost identical to

result = setB - setA

The only difference would be that all the tuples would be reversed.

If you had different books in each list, you would need to add a couple of extra steps to clean up the duplicates and combine the two sets:

resultA = setA - setB
resultB = setB.difference(x[::-1] for x in setA)
result = resultA | resultB

The first step computes all the elements from lstA that lstB does not agree with. The next step finds the elements of lstB that aren't reversed versions of what we have in resultA, since the disagreements over books in both lists are guaranteed to be reversed in the sets. I used the method set.difference here in preference to the - operator because that way there is no need to create a set object from the generator expression. You can't just use symmetric_difference/^ unfortunately because the elements are reversed. The third step just computes the union of the results.

IDEOne Link: https://ideone.com/DuHTed. This demos both the original case in the question and the asymmetric lists.

Upvotes: 3

vash_the_stampede
vash_the_stampede

Reputation: 4606

You can use iter and then compare indices

res = []  

for i in lstA:
    a = iter(lstB)
    while True:
        try:
            b = next(a)
            if lstA.index(i) < lstA.index(b) and lstB.index(i) > lstB.index(b):
                res.append((i, b))
        except StopIteration:
            break 

print(res)
# [('Harry Potter', '50 Shades'), ('Harry Potter', 'Dracula'), ('Harry Potter', '1984'), ('1984', '50 Shades'), ('1984', 'Dracula')]

Upvotes: 0

slider
slider

Reputation: 12990

An efficient version of @jpp's solution is as follows:

from itertools import combinations

lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']

bIndices = {b: i for i, b in enumerate(lstB)}
aPairs = [sorted(c) for c in combinations(enumerate(lstA), 2)]

mismatches = [(book1[1], book2[1]) for book1, book2 in aPairs if bIndices[book1[1]] > bIndices[book2[1]]]
print(mismatches)
# [('Harry Potter', '1984'), ('Harry Potter', '50 Shades'), ('Harry Potter', 'Dracula'), ('1984', '50 Shades'), ('1984', 'Dracula')]

Note that aPairs are combinations of (index, book) tuples and each combination is sorted by the index which guarantees that in each pair of books, the first is "better" than the next (for user A).

Now to compute ordering mismatches, we just need to decide whether the corresponding book indices in lstB also preserve this ordering.

Edit

As @MadPhysicist noted, combinations preserves the original order in the array in each generated tuple, so no need to create aPairs as a list of sorted (index, book) tuples. We can directly generate mismatches with just bIndices:

lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']

bIndices = {b: i for i, b in enumerate(lstB)}
mismatches = [(book1, book2) for book1, book2 in combinations(lstA, 2) if bIndices[book1] > bIndices[book2]]

Upvotes: 2

jpp
jpp

Reputation: 164623

First formulate your logic mathematically. For all combinations of length 2, given indices idx_a1, idx_a2 and idx_b1, idx_b2, if sign(idx_a1 - idx_a2) != sign(idx_b1 - idx_b2), record the combination.

The below isn't efficient, but it shows one way of transforming this logic to code:

from itertools import combinations

lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']

def sign(x):
    """Return +1 if integer is positive, -1 if negative"""
    return (x > 0) - (x < 0)

res = []
for a, b in combinations(lstA, 2):
    idx_a1, idx_a2 = lstA.index(a), lstA.index(b)
    idx_b1, idx_b2 = lstB.index(a), lstB.index(b)
    if sign(idx_a1 - idx_a2) != sign(idx_b1 - idx_b2):
        res.append((a, b))

[('Harry Potter', '1984'),
 ('Harry Potter', '50 Shades'),
 ('Harry Potter', 'Dracula'),
 ('1984', '50 Shades'),
 ('1984', 'Dracula')]

Upvotes: 4

Related Questions