Reputation: 409
I wrote a script to reformat a tab-delimited matrix (with header) into a "long format". See example below. It performs the task correctly but it seems to get stuck in an endless loop...
Example of input:
WHO THING1 THING2
me me1 me2
you you1 you2
Desired output:
me THING1 me1
me THING2 me2
you THING1 you1
you THING2 you2
Here is the code:
import csv
matrix_file = open('path')
matrix_reader = csv.reader(matrix_file, delimiter="\t")
j = 1
while j:
    matrix_file.seek(0)
    rownum = 0
    for i in matrix_reader:
        rownum+=1
        if j == int(len(i)):
            j = False
        elif rownum ==1:
            header = i[j]
        else:
            print i[0], "\t",header, "\t",i[j]
j +=1
I think it has to do with my exit command (j = False). Any ideas?
edit: Thanks for the suggestions. I think a typo in my initial posting led to some confusion, sorry about that. For now I have employed a simple solution:
valid = True
while valid:
    matrix_file.seek(0)
    rownum = 0
    for i in matrix_reader:
        rownum+=1
        if j == int(len(i)):
            valid = False
        etc, etc, etc...
Upvotes: 0
Views: 137
Reputation: 151047
Your j += 1 is outside the while loop, so j never increases. If len(i) is never less than 2, then you'll have an infinite loop.
But as has been observed, there are other problems with this code. Here's a working version based on your idiom. I would do a lot of things differently, but perhaps you'll find it useful to see how your code could have worked:
j = 1
while j:
    matrix_file.seek(0)
    rownum = 0
    for i in matrix_reader:
        rownum += 1
        if j == len(i) or j == -1:
            j = -1
        elif rownum == 1:
            header = i[j]
        else:
            print i[0], "\t", header, "\t", i[j]
    j += 1
It doesn't print the rows in the order you wanted, but it gets the basics right.
Here's how I would do it instead. I see that this is similar to what Ashwini Chaudhary posted, but a bit more generalized:
import csv
matrix_file = open('path')
matrix_reader = csv.reader(matrix_file, delimiter="\t")
headers = next(matrix_reader, '')
for row in matrix_reader:
    for header, value in zip(headers[1:], row[1:]):
        print row[0], header, value
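For anyone who wants to try the zip-based approach without a file on disk, here is a minimal sketch of the same idea. It assumes Python 3 (so print is a function, unlike the Python 2 code above), and the names wide_to_long, sample and long_rows are made up for illustration; the in-memory sample simply stands in for the real file.

```python
import csv
import io

def wide_to_long(lines):
    """Yield (row_label, column_header, value) triples from tab-delimited lines."""
    reader = csv.reader(lines, delimiter="\t")
    headers = next(reader, [])  # empty input: no headers, so nothing is yielded
    for row in reader:
        # Pair each data column with its header; column 0 is the row label.
        for header, value in zip(headers[1:], row[1:]):
            yield row[0], header, value

# Hypothetical in-memory sample mirroring the example in the question.
sample = io.StringIO("WHO\tTHING1\tTHING2\nme\tme1\tme2\nyou\tyou1\tyou2\n")
long_rows = list(wide_to_long(sample))
for record in long_rows:
    print("\t".join(record))
```

Because the outer loop runs over rows and the inner loop over columns, the records come out grouped by row, which is the order asked for in the question.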
Upvotes: 4
Reputation: 251011
j+=1 is outside the while loop, as senderle's answer says.
Other improvements:
int(len(i)): just use len(i). len() always returns an int, so there is no need for int() around it.
for rownum, i in enumerate(matrix_reader): with this there is no need to maintain a separate rownum variable; enumerate increments it for you.
EDIT: A working version of your code. I don't think a while loop is needed here; the for loop is sufficient.
import csv
matrix_file = open('data1.csv')
matrix_reader = csv.reader(matrix_file, delimiter="\t")
header=matrix_reader.next()[0].split() #now header is ['WHO', 'THING1', 'THING2']
for i in matrix_reader:
    line=i[0].split()
    print line[0], "\t",header[1], "\t",line[1]
    print line[0], "\t",header[2], "\t",line[2]
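The two suggestions above (plain len() and enumerate in place of a hand-maintained rownum) can be combined into one loop that also copes with any number of columns. This is only a sketch under a few assumptions: it is written for Python 3, it reads from an in-memory string instead of a real file, and the names long_format, sample and records are invented for the example.

```python
import csv
import io

def long_format(lines):
    """Return (row_label, column_header, value) triples from tab-delimited lines."""
    records = []
    headers = []
    # enumerate supplies the row number, so no manual counter is needed.
    for rownum, row in enumerate(csv.reader(lines, delimiter="\t")):
        if rownum == 0:
            headers = row  # the first row holds the column names
            continue
        # len() already returns an int; stop at the shorter of the two rows.
        for col in range(1, min(len(headers), len(row))):
            records.append((row[0], headers[col], row[col]))
    return records

# Hypothetical in-memory sample mirroring the example in the question.
sample = io.StringIO("WHO\tTHING1\tTHING2\nme\tme1\tme2\nyou\tyou1\tyou2\n")
records = long_format(sample)
for record in records:
    print("\t".join(record))
```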
Upvotes: 3