Reputation: 1252
I think this is probably something people have already solved, and there may even be some baked-in functionality I'm missing, so I figured I'd ask before I reinvent the wheel.
Basically, given some pairwise output from itertools.combinations, I'd like to represent it as a matrix/table of each comparison.
So far, I'm roughly up to this point:
from itertools import combinations

def chunks(l, n):
    n = max(1, n)
    return [l[i:i+n] for i in range(0, len(l), n)]

x = [("A", 1), ("B", 2), ("C", 3), ("D", 4), ("E", 5)]
[print(i) for i in chunks([i[1]+j[1] for i, j in combinations(x, 2)], len(x)-1)]
This gives me a matrix-style output:
[3, 4, 5, 6]
[5, 6, 7, 7]
[8, 9]
[None, None, None]
I'm not sure where the None values are coming from just yet, as the output of chunks([i[1]+j[1] for i, j in combinations(x, 2)], len(x)-1) is:
[[3, 4, 5, 6], [5, 6, 7, 7], [8, 9]]
But I can look into that later on (but feel free to point out my obvious mistake!)
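For what it's worth, the None values most likely come from the list comprehension itself rather than from chunks: print returns None, so wrapping it in [...] builds a list of three None values, and an interactive session then echoes that list. A minimal sketch of the difference, assuming the code is run in a REPL or notebook (the name leftover is only for illustration):

```python
rows = [[3, 4, 5, 6], [5, 6, 7, 7], [8, 9]]

# print() returns None, so this comprehension also builds [None, None, None];
# an interactive session echoes that list after the printed rows.
leftover = [print(row) for row in rows]

# A plain loop prints the same rows without creating the throwaway list.
for row in rows:
    print(row)
```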
I'd ideally like to end up with a pairwise matrix, ideally with the names of the comparisons attached too, so that it would appear something like:
    A   B   C   D   E
A       3   4   5   6
B           5   6   7
C               7   8
D                   9
E
It's clear my naive approach of chunking by the length of the input data isn't quite right either, as the 7 belonging to the C+D comparison is on the wrong line. I'd forgotten to account for the additional entry disappearing each time.
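One way to account for the shrinking rows is to cut the flat list of results into pieces of length n-1, n-2, ..., 1 instead of a fixed size. The following is only a sketch under that assumption; triangular_rows is a hypothetical helper name, not something provided by itertools:

```python
from itertools import combinations, islice

x = [("A", 1), ("B", 2), ("C", 3), ("D", 4), ("E", 5)]

def triangular_rows(values, n):
    """Split a flat list of pairwise results into rows of length n-1, n-2, ..., 1."""
    it = iter(values)
    # Each islice call consumes the next `size` items from the shared iterator.
    return [list(islice(it, size)) for size in range(n - 1, 0, -1)]

sums = [a[1] + b[1] for a, b in combinations(x, 2)]
rows = triangular_rows(sums, len(x))
for row in rows:
    print(row)
```

With the sample data this puts the C+D result (7) at the start of the third row, alongside C+E.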
If there's a better way altogether, I'm happy to change the approach. I've focussed on using itertools for this as it may end up being run over large files with potentially thousands of comparisons, in a bigger script with other calculations etc. happening, so avoiding self-comparisons and repeat comparisons is ideal.
To add, I'd like to subsequently be able to output the matrix that I depicted, with the row and column names, to a TSV/CSV file or similar.
Upvotes: 0
Views: 237
Reputation: 49896
This creates a matrix as you describe, using 0's for the "blanks":
[[(a[1]+b[1] if a[0]<b[0] else 0) for b in x] for a in x]
To print it out:
print("\t".join(['']+[a[0] for a in x]))
for a in x:
    print("\t".join([a[0]] + [(str(a[1]+b[1]) if a[0]<b[0] else '') for b in x]))
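If the table needs to go to a TSV file rather than the console, the standard csv module can do the writing. The sketch below is one possible variant, not the only way to do it: it computes each pair once with itertools.combinations and looks results up by (row name, column name), so it does not depend on the names sorting in input order. It assumes the names are unique; write_pairwise_tsv is a hypothetical helper name, and io.StringIO stands in for a real file here.

```python
import csv
import io
from itertools import combinations

x = [("A", 1), ("B", 2), ("C", 3), ("D", 4), ("E", 5)]

def write_pairwise_tsv(items, handle):
    """Write the upper-triangular pairwise sums of (name, value) items as TSV."""
    names = [name for name, _ in items]
    # Each unordered pair is computed once, keyed by (row name, column name).
    sums = {(a[0], b[0]): a[1] + b[1] for a, b in combinations(items, 2)}
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow([""] + names)  # header row with an empty corner cell
    for row_name in names:
        # Pairs that were never computed (diagonal, lower triangle) stay blank.
        writer.writerow([row_name] + [sums.get((row_name, col), "") for col in names])

buffer = io.StringIO()
write_pairwise_tsv(x, buffer)
tsv_text = buffer.getvalue()
print(tsv_text)
```

To write to disk instead, the same function can be passed a handle opened with open(path, "w", newline=""), where path is whatever output filename is wanted.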
Upvotes: 1