Homap

Reputation: 2214

Read a matrix from a text file into NumPy

I am using software that outputs only the upper triangle of a symmetric matrix (without the diagonal) in the following format:

         2       3       4       5       6       7       8     
1:   -0.00    0.09    0.03   -0.27   -0.28    0.83   -0.31  
2:            0.09    0.03   -0.26   -0.28    0.83   -0.31
3:                    0.00    0.11    0.11    0.33    0.10 
4:                            0.03    0.03   -0.00    0.03 
5:                                   -0.02    0.91   -0.04 
6:                                            0.92   -0.03 
7:                                                    0.91 

I would like to plot this matrix as a heatmap, but I am having trouble reading the text file into a data structure. How can I turn this file into, for example, a numpy array that I can use as the matrix for plotting?

Thank you!

Upvotes: 1

Views: 288

Answers (2)

Homap

Reputation: 2214

I came up with the following solution, which puts '0' on the missing diagonal and mirrors the upper triangle into the lower one:

# Read the rows, dropping the first token of each line (the "1:" row label).
with open("test_fit") as t:
    long_l = [line.split()[1:] for line in t if line.strip()]

# Skip the header line with the column numbers.
long_l_new = long_l[1:]

# Put a '0' on the diagonal of every row; the last row holds only its diagonal.
for item in long_l_new:
    item.insert(0, '0')
long_l_new.append(['0'])

# Mirror the upper triangle: row index + 1 receives column index + 1
# of every row above it, inserted in order at the front.
for index in range(len(long_l_new) - 1):
    new_l = long_l_new[index + 1]
    for i in range(index + 1):
        new_l.insert(i, long_l_new[i][index + 1])

print(long_l_new)

Output:

[['0', '-0.00', '0.09', '0.03', '-0.27', '-0.28', '0.83', '-0.31'],
 ['-0.00', '0', '0.09', '0.03', '-0.26', '-0.28', '0.83', '-0.31'],
 ['0.09', '0.09', '0', '0.00', '0.11', '0.11', '0.33', '0.10'],
 ['0.03', '0.03', '0.00', '0', '0.03', '0.03', '-0.00', '0.03'],
 ['-0.27', '-0.26', '0.11', '0.03', '0', '-0.02', '0.91', '-0.04'],
 ['-0.28', '-0.28', '0.11', '0.03', '-0.02', '0', '0.92', '-0.03'],
 ['0.83', '0.83', '0.33', '-0.00', '0.91', '0.92', '0', '0.91'],
 ['-0.31', '-0.31', '0.10', '0.03', '-0.04', '-0.03', '0.91', '0']]
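
The list above still holds strings, so it needs a conversion to floats before plotting. As an alternative, the same matrix can be built directly as a float numpy array. This is only a sketch: the function name `read_upper_triangle` and the small sample are made up for illustration, and it assumes the layout from the question (one header line, a row label such as "1:" at the start of each row, no diagonal entries), with the missing diagonal set to 0 as in the code above.

```python
import numpy as np

def read_upper_triangle(lines):
    """Build a full symmetric matrix from the rows of a strict upper triangle."""
    # Drop blank lines, then skip the header line with the column numbers.
    rows = [line.split() for line in lines if line.strip()][1:]
    n = len(rows) + 1  # the diagonal is absent, so the matrix has one extra row
    mat = np.zeros((n, n))
    for i, tokens in enumerate(rows):
        # tokens[0] is the row label ("1:"); the rest fill columns i+1 .. n-1.
        mat[i, i + 1:] = [float(v) for v in tokens[1:]]
    # The diagonal is zero, so adding the transpose mirrors the triangle.
    return mat + mat.T

# Made-up 4x4 sample in the same layout as the file in the question.
sample = """\
         2       3       4
1:    0.10    0.20    0.30
2:            0.40    0.50
3:                    0.60
"""
mat = read_upper_triangle(sample.splitlines())
print(mat)
```

For the file in the question, passing `f.readlines()` from `open("test_fit")` to the function should give an 8x8 array.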

Upvotes: 0

StupidWolf

Reputation: 46888

If I understand your text file correctly, you can read it with pandas, using whitespace as the delimiter:

import pandas as pd
import numpy as np
# Raw string so that \s is passed to the regex engine and not treated as a string escape.
dat = pd.read_csv("test.txt", index_col=0, delimiter=r'\s+').to_numpy()

Looks like this:

array([[-0.  ,  0.09,  0.03, -0.27, -0.28,  0.83, -0.31],
       [ 0.09,  0.03, -0.26, -0.28,  0.83, -0.31,   nan],
       [ 0.  ,  0.11,  0.11,  0.33,  0.1 ,   nan,   nan],
       [ 0.03,  0.03, -0.  ,  0.03,   nan,   nan,   nan],
       [-0.02,  0.91, -0.04,   nan,   nan,   nan,   nan],
       [ 0.92, -0.03,   nan,   nan,   nan,   nan,   nan],
       [ 0.91,   nan,   nan,   nan,   nan,   nan,   nan]])

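The values in each row are pushed to the left, so we just need to rotate row i to the right by i positions, which moves the nan to the front: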

arr = np.empty(dat.shape)
for i in range(dat.shape[0]):
    # Rotate row i right by i; the trailing nan values wrap around to the front.
    arr[i] = np.roll(dat[i], i)

And the end result looks like this:

arr

array([[-0.  ,  0.09,  0.03, -0.27, -0.28,  0.83, -0.31],
       [  nan,  0.09,  0.03, -0.26, -0.28,  0.83, -0.31],
       [  nan,   nan,  0.  ,  0.11,  0.11,  0.33,  0.1 ],
       [  nan,   nan,   nan,  0.03,  0.03, -0.  ,  0.03],
       [  nan,   nan,   nan,   nan, -0.02,  0.91, -0.04],
       [  nan,   nan,   nan,   nan,   nan,  0.92, -0.03],
       [  nan,   nan,   nan,   nan,   nan,   nan,  0.91]])
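
Note that the columns of `arr` correspond to labels 2 to 8 and the rows to labels 1 to 7, so this is not yet the full 8x8 symmetric matrix. One possible way to finish the job is sketched below. The helper name `complete_symmetric` and the small example array are made up for illustration, and the missing diagonal is assumed to be 0, since the file does not contain it.

```python
import numpy as np

def complete_symmetric(arr):
    """Expand a shifted upper triangle (nan below the diagonal) to a full symmetric matrix."""
    n = arr.shape[0] + 1  # one extra row and column for the missing diagonal
    full = np.zeros((n, n))
    # arr[i, j] belongs to row label i+1 and column label j+2, so it sits
    # one column to the right in the full matrix. nan is replaced by 0.0.
    full[:-1, 1:] = np.nan_to_num(arr)
    # full is strictly upper triangular here, so adding the transpose mirrors it.
    return full + full.T

# Made-up 3x3 example with the same layout as `arr` above.
nan = np.nan
small = np.array([[0.1, 0.2, 0.3],
                  [nan, 0.4, 0.5],
                  [nan, nan, 0.6]])
full = complete_symmetric(small)
print(full)
```

The resulting array can then be handed to a plotting function such as matplotlib's `plt.imshow(full)` or seaborn's `sns.heatmap(full)` to draw the heatmap.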

Upvotes: 1
