Pazu

Reputation: 287

Trying to convert a .txt file of edges into an edge list

I have a .txt file in this format:

0   61
0   33344
0   33412
0   36114
0   37320
0   37769
0   37924

This is in fact a list of edges for a network, which I want to convert into the following:

elist = [(0,61), (0,33344), (0,33412), (0,36114), (0,37320), (0,37769), (0,37924)]

My idea was the following:

import csv

data = open("path_to_file.txt", 'r')
reader = csv.reader(data)
allRows = [tuple(row) for row in reader]

The problem is that I receive this:

[('0\t61',), ('0\t33344',), ('0\t33412',), ('0\t36114',), ('0\t37320',), ('0\t37769',), ('0\t37924',)]

How can we fix this?

Upvotes: 1

Views: 1233

Answers (2)

Daniel R. Livingston

Reputation: 1229

Other posters have mentioned that you can pass the escape code \t as the delimiter to csv.reader() to split each row into its columns. That works when the file is tab-delimited. Judging from the file as pasted, though, your delimiter may be a run of spaces rather than a single tab, in which case it will not work.

If you print out [row for row in reader] with the default delimiter, you find that the two values on each line are not split into separate elements:

 [['0   61'],
 ['0   33344'],
 ['0   33412'],
 ['0   36114'],
 ['0   37320'],
 ['0   37769'],
 ['0   37924']]

Therefore, attempts to turn each row directly into a two-element tuple will fail, as there is only one str element per row. Each row in reader is a single-element list, so row[0] gives you the actual string value: '0   61'.
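The single-element rows can be reproduced without a file. This is a small sketch in which an io.StringIO object stands in for the real file and the two sample lines are made up for illustration. Because neither line contains a comma, csv.reader with its default delimiter returns each line as one field:

```python
import csv
import io

# Stand-in for the real file: one tab-separated line, one space-separated line.
sample = io.StringIO("0\t61\n0   33344\n")

# The default delimiter is a comma, so each line becomes a single field.
rows = list(csv.reader(sample))
print(rows)
```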

We then use .split() to create two elements from this string:

In [47]: '0   61'.split()
Out[47]: ['0', '61']

Now we can use map to create integers from these two new strings (in Python 3, map returns an iterator, so wrap it in list to see the values):

In [49]: list(map(int, '0   61'.split()))
Out[49]: [0, 61]

Then we convert each pair to a tuple and collect the tuples in a list, as you did above, and we have a working solution.

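import csv

with open("path_to_file.txt", 'r') as data:
    reader = csv.reader(data)
    # each row is a one-element list; split its string on whitespace and convert to ints
    allRows = [tuple(map(int, row[0].split())) for row in reader]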

In [43]: allRows
Out[43]:
[(0, 61),
 (0, 33344),
 (0, 33412),
 (0, 36114),
 (0, 37320),
 (0, 37769),
 (0, 37924)]
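Since .split() with no arguments splits on any run of whitespace, tabs and spaces alike, the same result can also be had without csv. This is a minimal sketch, assuming every non-blank line holds integers only. The function name read_edges and the inline sample text are illustrative and not part of the question:

```python
import io

def read_edges(lines):
    """Parse whitespace-separated integers on each line into a list of tuples."""
    edges = []
    for line in lines:
        parts = line.split()  # splits on tabs or any run of spaces
        if not parts:         # skip blank lines
            continue
        edges.append(tuple(int(p) for p in parts))
    return edges

# Stand-in for the file; with a real file, pass the open file object instead.
sample = io.StringIO("0\t61\n0   33344\n\n0\t33412\n")
elist = read_edges(sample)
print(elist)
```

With a real file this would be called as read_edges(f) inside a with open(...) as f: block.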

Upvotes: 1

jeremysprofile

Reputation: 11464

import csv

data = open("path_to_file.txt", 'r')
reader = csv.reader(data)
# each row is a one-element list, so split its single string on the tab
allRows = [tuple(row[0].split('\t')) for row in reader]

You were close: with the default comma delimiter each row comes back as a one-element list, so you split its single string on the tab. csv can also be told to split on tabs instead of commas, if you'd rather go that route.

EDIT: as @roganjosh said, you could just do

import csv

data = open("path_to_file.txt", 'r')
reader = csv.reader(data, delimiter='\t')
allRows = [tuple(row) for row in reader]
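Note that csv.reader always yields strings, so the tuples above hold '0' and '61' rather than the integers asked for in the question. Below is a sketch that adds the conversion, assuming the file really is tab-delimited. The helper name read_tab_edges is hypothetical, and an io.StringIO object stands in for the file:

```python
import csv
import io

def read_tab_edges(fileobj):
    """Read tab-delimited rows and convert every field to int."""
    reader = csv.reader(fileobj, delimiter='\t')
    # 'if row' skips blank lines, which csv.reader returns as empty lists
    return [tuple(int(field) for field in row) for row in reader if row]

# Stand-in for the real file.
sample = io.StringIO("0\t61\n0\t33344\n")
allRows = read_tab_edges(sample)
print(allRows)
```

If the file turns out to use spaces instead of tabs, int() will raise a ValueError on the unsplit field, which at least makes the wrong delimiter obvious.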

Upvotes: 2
