Reputation: 2214
I am using software that outputs only the upper triangle of a symmetric matrix, in the following format:
2 3 4 5 6 7 8
1: -0.00 0.09 0.03 -0.27 -0.28 0.83 -0.31
2: 0.09 0.03 -0.26 -0.28 0.83 -0.31
3: 0.00 0.11 0.11 0.33 0.10
4: 0.03 0.03 -0.00 0.03
5: -0.02 0.91 -0.04
6: 0.92 -0.03
7: 0.91
I would like to plot this matrix as a heatmap, but I am having trouble reading this
text file into a data structure. How can I turn it into, for example, a NumPy
array that I can use as the matrix for plotting?
Thank you!
Upvotes: 1
Views: 288
Reputation: 2214
I came up with the following solution:
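# Read the file, dropping the row label ("1:", "2:", ...) from each line.
long_l = []
with open("test_fit") as t:
    for line in t:
        line = line.rstrip().split()
        long_l.append(line[1:])
# Skip the column-header line.
long_l_new = long_l[1:]
print(long_l_new)
# Prepend the diagonal entry (set to '0') to each row.
for index, item in enumerate(long_l_new):
    print(index, item)
    item.insert(0, '0')
# The last row holds only its diagonal entry.
long_l_new.append(['0'])
mat = []
# Fill in the lower triangle: row index + 1 receives, at position i,
# the mirrored entry long_l_new[i][index + 1] from each row above it.
for index, item in enumerate(long_l_new):
    if index == 0:
        to_insert = long_l_new[index][index + 1]
        new_l = long_l_new[index + 1]
        new_l_to_add = new_l.insert(index, to_insert)
    else:
        if index < len(long_l_new) - 1:
            for i in range(0, index+1):
                to_insert = long_l_new[i][index + 1]
                new_l = long_l_new[index + 1]
                new_l.insert(i, to_insert)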
Output:
[['0', '-0.00', '0.09', '0.03', '-0.27', '-0.28', '0.83', '-0.31'],
['-0.00', '0', '0.09', '0.03', '-0.26', '-0.28', '0.83', '-0.31'],
['0.09', '0.09', '0', '0.00', '0.11', '0.11', '0.33', '0.10'],
['0.03', '0.03', '0.00', '0', '0.03', '0.03', '-0.00', '0.03'],
['-0.27', '-0.26', '0.11', '0.03', '0', '-0.02', '0.91', '-0.04'],
['-0.28', '-0.28', '0.11', '0.03', '-0.02', '0', '0.92', '-0.03'],
['0.83', '0.83', '0.33', '-0.00', '0.91', '0.92', '0', '0.91'],
['-0.31', '-0.31', '0.10', '0.03', '-0.04', '-0.03', '0.91', '0']]
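The nested list above still holds strings, so it would need a float conversion before plotting. As a more compact route to the same matrix, here is a minimal sketch that fills the strict upper triangle with np.triu_indices and mirrors it. The helper name read_upper_triangle and its diagonal parameter are my own assumptions, not part of the original code. The file does not give the diagonal, so 0 is used here only to match the output above.

```python
import numpy as np

# Sample text copied from the question; in practice, read it from the file instead.
text = """\
2 3 4 5 6 7 8
1: -0.00 0.09 0.03 -0.27 -0.28 0.83 -0.31
2: 0.09 0.03 -0.26 -0.28 0.83 -0.31
3: 0.00 0.11 0.11 0.33 0.10
4: 0.03 0.03 -0.00 0.03
5: -0.02 0.91 -0.04
6: 0.92 -0.03
7: 0.91
"""


def read_upper_triangle(text, diagonal=0.0):
    """Hypothetical helper: build the full symmetric matrix from the triangle text."""
    lines = text.strip().splitlines()[1:]  # skip the column-header line
    # Drop the "1:" style row label and flatten the values row by row.
    values = [float(v) for line in lines for v in line.split()[1:]]
    n = len(lines) + 1  # 7 rows of off-diagonal values -> 8x8 matrix
    mat = np.full((n, n), diagonal)  # diagonal value is an assumption
    rows, cols = np.triu_indices(n, k=1)  # strict upper triangle, row-major order
    mat[rows, cols] = values
    mat[cols, rows] = values  # mirror into the lower triangle
    return mat


mat = read_upper_triangle(text)
print(mat.shape)
```

The resulting float array can be passed straight to a heatmap call such as matplotlib's imshow.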
Upvotes: 0
Reputation: 46888
If I read your text file correctly, you can load it with pandas using a whitespace delimiter:
import pandas as pd
import numpy as np
dat = pd.read_csv("test.txt", index_col=0, delimiter=r'\s+').to_numpy()
The result looks like this:
array([[-0. , 0.09, 0.03, -0.27, -0.28, 0.83, -0.31],
[ 0.09, 0.03, -0.26, -0.28, 0.83, -0.31, nan],
[ 0. , 0.11, 0.11, 0.33, 0.1 , nan, nan],
[ 0.03, 0.03, -0. , 0.03, nan, nan, nan],
[-0.02, 0.91, -0.04, nan, nan, nan, nan],
[ 0.92, -0.03, nan, nan, nan, nan, nan],
[ 0.91, nan, nan, nan, nan, nan, nan]])
So we just need to move the nan padding to the front of each row, i.e. rotate row i to the right by i places:
idx = np.arange(dat.shape[1])
arr = np.empty(dat.shape)
for i in range(dat.shape[1]):
    arr[i] = dat[i][np.concatenate([idx[-i:],idx[:-i]])]
And the end result looks like this:
arr
array([[-0. , 0.09, 0.03, -0.27, -0.28, 0.83, -0.31],
[ nan, 0.09, 0.03, -0.26, -0.28, 0.83, -0.31],
[ nan, nan, 0. , 0.11, 0.11, 0.33, 0.1 ],
[ nan, nan, nan, 0.03, 0.03, -0. , 0.03],
[ nan, nan, nan, nan, -0.02, 0.91, -0.04],
[ nan, nan, nan, nan, nan, 0.92, -0.03],
[ nan, nan, nan, nan, nan, nan, 0.91]])
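Note that arr covers rows 1 to 7 against columns 2 to 8, so it is not yet the full 8x8 symmetric matrix. Here is a rough sketch of one way to finish the job, assuming the missing diagonal can be set to 0 (the file does not provide it). It uses np.roll, which should give the same rotation as the index concatenation above. The name full is just for illustration.

```python
import numpy as np

nan = np.nan
# The padded array produced by read_csv, copied from the output above.
dat = np.array([
    [-0.00, 0.09, 0.03, -0.27, -0.28, 0.83, -0.31],
    [0.09, 0.03, -0.26, -0.28, 0.83, -0.31, nan],
    [0.00, 0.11, 0.11, 0.33, 0.10, nan, nan],
    [0.03, 0.03, -0.00, 0.03, nan, nan, nan],
    [-0.02, 0.91, -0.04, nan, nan, nan, nan],
    [0.92, -0.03, nan, nan, nan, nan, nan],
    [0.91, nan, nan, nan, nan, nan, nan],
])

# Rotate row i right by i places, which moves the nan padding to the front.
arr = np.array([np.roll(row, i) for i, row in enumerate(dat)])

# Place the 7x7 triangle one column to the right inside an 8x8 array of zeros.
# The zero diagonal is an assumption, not something the file states.
n = arr.shape[0] + 1
full = np.zeros((n, n))
full[:-1, 1:] = np.nan_to_num(arr)  # nan padding becomes 0
full = full + full.T  # mirror the upper triangle into the lower one
print(full.shape)
```

Since the diagonal and the lower triangle are zero before the last step, adding the transpose only copies the upper triangle across.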
Upvotes: 1