user136819

Reputation: 239

Python: Edge List to an Adjacency Matrix using SciPy/Pandas shows IndexError: column index (3) out of bounds

I have a text file with an edge list (edge.txt):

1 1 0.00000000000000000000
1 2 0.25790529076045041
1 3 0.77510411846367422
2 1 0.34610027855153203
2 2 0.00000000000000000000
2 3 0.43889275766016713
3 1 0.75335810231494713
3 2 0.22234924264075450
3 3 0.00000000000000000000  

The edge weights are floating-point values, as shown, and the separators are whitespace, which I must keep in the text file. I want to convert this edge list to a matrix like the following and store it in a CSV file:

    1         2         3
1   0.000000  0.257905  0.775104
2   0.346100  0.000000  0.438893
3   0.753358  0.222349  0.000000  

I have the following code (txttocsv2.py) which I thought would work, but unfortunately does not:

import numpy as np
import scipy.sparse as sps
import csv
import pandas as pd

with open('connectivity.txt', 'r') as fil:

    A = np.genfromtxt(fil)

    i, j, weight = A[:,0], A[:,1], A[:,2]

    dim =  max(len(set(i)), len(set(j)))

    B = sps.lil_matrix((dim, dim))
    for i,j,w in zip(i,j,weight):
        B[i,j] = w

    for row in B: #I want to print the output as well to see if it works
        print(row)

    with open("connect.csv", "wb") as f:
        for row in B:
            writer = csv.writer(f)
            writer.writerow(B)  

The error is:

Traceback (most recent call last):
  File "txttocsv2.py", line 16, in <module>
    B[i,j] = w
  File "/home/osboxes/pymote_env/local/lib/python2.7/site-packages/scipy/sparse/lil.py", line 379, in __setitem__
    i, j, x)
  File "scipy/sparse/_csparsetools.pyx", line 231, in scipy.sparse._csparsetools.lil_fancy_set (scipy/sparse/_csparsetools.c:5041)
  File "scipy/sparse/_csparsetools.pyx", line 376, in scipy.sparse._csparsetools._lil_fancy_set_int32_float64 (scipy/sparse/_csparsetools.c:7021)
  File "scipy/sparse/_csparsetools.pyx", line 87, in scipy.sparse._csparsetools.lil_insert (scipy/sparse/_csparsetools.c:3216)
IndexError: column index (3) out of bounds  

Could anybody point out where the code is failing and help me out?
Thanks in advance :)
Using Ubuntu 14.04 32-bit VM and Python 2.7

Upvotes: 2

Views: 1298

Answers (1)

Jonathan H.

Reputation: 56

Your code writes to location (i, j) in matrix B. The problem is that i and j in your file are one-based, while the matrix indices are zero-based, so index 3 falls outside a 3x3 matrix whose valid indices are 0 to 2. You should switch to B[i-1,j-1] = w. Also, you probably need to change the line writer.writerow(B) to writer.writerow(row).
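The index fix can be sketched as follows. This is a minimal sketch rather than a drop-in replacement for the script: it assumes Python 3, inlines the question's edge list instead of reading a file, and assumes node labels are contiguous integers starting at 1. The helper names edge_list_to_dense and matrix_to_csv_text are made up for illustration. It also casts the labels to int before indexing, as a precaution, because genfromtxt returns floats.

```python
import csv
import io

import numpy as np
import scipy.sparse as sps

# The question's edge list, inlined so the sketch runs without a file.
EDGE_TEXT = """\
1 1 0.00000000000000000000
1 2 0.25790529076045041
1 3 0.77510411846367422
2 1 0.34610027855153203
2 2 0.00000000000000000000
2 3 0.43889275766016713
3 1 0.75335810231494713
3 2 0.22234924264075450
3 3 0.00000000000000000000
"""


def edge_list_to_dense(text):
    """Return a dense adjacency matrix from one-based 'i j weight' lines."""
    data = np.genfromtxt(io.StringIO(text))
    # Cast labels to int and shift them from one-based to zero-based.
    rows = data[:, 0].astype(int) - 1
    cols = data[:, 1].astype(int) - 1
    # Assumes labels are contiguous, so the largest label gives the size.
    dim = int(max(rows.max(), cols.max())) + 1
    mat = sps.lil_matrix((dim, dim))
    for r, c, w in zip(rows, cols, data[:, 2]):
        mat[r, c] = w
    return mat.toarray()


def matrix_to_csv_text(matrix):
    """Serialize the matrix as CSV text, one matrix row per line."""
    buf = io.StringIO()
    # tolist() converts NumPy scalars to plain Python floats before writing.
    csv.writer(buf).writerows(matrix.tolist())
    return buf.getvalue()


dense = edge_list_to_dense(EDGE_TEXT)
csv_text = matrix_to_csv_text(dense)
print(csv_text)
```

To write a file instead of a string, the same csv.writer call should work on a file opened with open("connect.csv", "w", newline="") in Python 3.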

Or like John Galt said, use pandas pivot:

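import pandas as pd

# sep=r'\s+' splits on any run of whitespace, so repeated or trailing spaces are harmless.
# pivot is called with keyword arguments, which newer pandas versions require.
df = pd.read_csv('edge.txt', sep=r'\s+', header=None)
df.pivot(index=0, columns=1, values=2).to_csv('connect.csv', header=False, index=False)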
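If the row and column labels from the question's desired output should be kept, the pivot result can be written with its index and header left in place. The sketch below assumes the question's data inlined as a string, and the column names src, dst and weight are hypothetical labels chosen for readability, not anything required by pandas.

```python
import io

import pandas as pd

# The question's edge list, inlined so the sketch runs without a file.
EDGE_TEXT = """\
1 1 0.00000000000000000000
1 2 0.25790529076045041
1 3 0.77510411846367422
2 1 0.34610027855153203
2 2 0.00000000000000000000
2 3 0.43889275766016713
3 1 0.75335810231494713
3 2 0.22234924264075450
3 3 0.00000000000000000000
"""

# Read whitespace-separated columns and give them illustrative names.
edges = pd.read_csv(
    io.StringIO(EDGE_TEXT),
    sep=r"\s+",
    header=None,
    names=["src", "dst", "weight"],
)

# Rows become source nodes, columns become target nodes.
matrix = edges.pivot(index="src", columns="dst", values="weight")

# With no path argument, to_csv returns the CSV as a string, labels included.
labeled_csv = matrix.to_csv()
print(matrix)
```

Passing a filename such as 'connect.csv' to to_csv would write the labeled matrix to disk instead of returning a string.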

Upvotes: 2
