Reputation: 89
I have a question and it is a bit hard for me to explain so I will be using lots of examples to help you all understand and see if you could help me.
Say I have two lists containing book names from best to worst rated by two people. User1 rated lstA
, and user2 rated lstB
lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']
User one thinks 'Harry Potter' is better than 'Dracula' (HP is index 0, and Dracula is index 3)
User two thinks 'Harry Potter' is worse than Dracula, (HP is index 3 and Dracula is index 1)
In this case, return a tuple ('Harry Potter', 'Dracula')
[('Dracula', 'Harry Potter')
is also fine]
User one also rated '50 shades' better than 'Dracula' and user two also rated '50 shades' better than 'Dracula' (index 2, 3 and 0, 1 respectively). In this case, nothing happens.
The final result of the program should return a list of tuples so,
[('Harry Potter','50 Shades'), ('Harry Potter','Dracula'), ('Harry Potter','1984'), ('1984', '50 Shades'), ('1984','Dracula')]
Could someone help me to point me in the right direction to come up with an algorithm that gives all the tuples?
Upvotes: 6
Views: 268
Reputation: 114230
One way to do this would be to accumulate all the positive orderings form each list into a set, then take the difference of the two sets. The positive ordering would be (a, b)
when the a
precedes b
in its respective list. This is the ordering guaranteed by itertools.combinations
:
from itertools import combinations
setA = set(combinations(lstA, 2))
setB = set(combinations(lstB, 2))
result = setA - setB
This would simply discard any orderings that the two sets agree on. If both lists had the same books, this would be almost identical to
result = setB - setA
The only difference would be that all the tuples would be reversed.
If you had different books in each list, you would need to add a couple of extra steps to clean up the duplicates and combine the two sets:
resultA = setA - setB
resultB = setB.difference(x[::-1] for x in setA)
result = resultA | resultB
The first step computes all the elements from lstA
that lstB
does not agree with. The next step finds the elements of lstB
that aren't reversed versions of what we have in resultA
, since the disagreements over books in both lists are guaranteed to be reversed in the sets. I used the method set.difference
here in preference to the -
operator because that way there is no need to create a set object from the generator expression. You can't just use symmetric_difference
/^
unfortunately because the elements are reversed. The third step just computes the union of the results.
IDEOne Link: https://ideone.com/DuHTed. This demos both the original case in the question and the asymmetric lists.
Upvotes: 3
Reputation: 4606
You can use iter
and then compare indices
res = []
for i in lstA:
a = iter(lstB)
while True:
try:
b = next(a)
if lstA.index(i) < lstA.index(b) and lstB.index(i) > lstB.index(b):
res.append((i, b))
except StopIteration:
break
print(res)
# [('Harry Potter', '50 Shades'), ('Harry Potter', 'Dracula'), ('Harry Potter', '1984'), ('1984', '50 Shades'), ('1984', 'Dracula')]
Upvotes: 0
Reputation: 12990
An efficient version of @jpp's solution is as follows:
from itertools import combinations
lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']
bIndices = {b: i for i, b in enumerate(lstB)}
aPairs = [sorted(c) for c in combinations(enumerate(lstA), 2)]
mismatches = [(book1[1], book2[1]) for book1, book2 in aPairs if bIndices[book1[1]] > bIndices[book2[1]]]
print(mismatches)
# [('Harry Potter', '1984'), ('Harry Potter', '50 Shades'), ('Harry Potter', 'Dracula'), ('1984', '50 Shades'), ('1984', 'Dracula')]
Note that aPairs
are combinations of (index, book) tuples and each combination is sorted by the index which guarantees that in each pair of books, the first is "better" than the next (for user A).
Now to compute ordering mismatches, we just need to decide whether the corresponding book indices in lstB
also preserve this ordering.
As @MadPhysicist noted, combinations
preserves the original order in the array in each generated tuple, so no need to create aPairs
as a list of sorted (index, book)
tuples. We can directly generate mismatches
with just bIndices
:
lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']
bIndices = {b: i for i, b in enumerate(lstB)}
mismatches = [(book1, book2) for book1, book2 in combinations(lstA, 2) if bIndices[book1] > bIndices[book2]]
Upvotes: 2
Reputation: 164623
First formulate your logic mathematically. For all combinations of length 2, given indices idx_a1, idx_a2
and idx_b1, idx_b2
, if sign(idx_a1 - idx_a2) != sign(idx_b1 - idx_b2)
, record the combination.
The below isn't efficient, but it shows one way of transforming this logic to code:
from itertools import combinations
lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']
def sign(x):
"""Return +1 if integer is positive, -1 if negative"""
return (x > 0) - (x < 0)
res = []
for a, b in combinations(lstA, 2):
idx_a1, idx_a2 = lstA.index(a), lstA.index(b)
idx_b1, idx_b2 = lstB.index(a), lstB.index(b)
if sign(idx_a1 - idx_a2) != sign(idx_b1 - idx_b2):
res.append((a, b))
[('Harry Potter', '1984'),
('Harry Potter', '50 Shades'),
('Harry Potter', 'Dracula'),
('1984', '50 Shades'),
('1984', 'Dracula')]
Upvotes: 4