Harper_C

Reputation: 63

append columns of data

I have tab-delimited data, and I am exporting a select few columns to another file. I have:

a b c d
1 2 3 4
5 6 7 8 
9 10 11 12

and I get:

b, d
b, d
2, 4
b, d
2, 4
6, 8
b, d
2, 4
6, 8
10, 12
......

I want:

b, d
2, 4
6, 8 
10, 12

My code is

f=open('data.txt', 'r')
f1=open('newdata.txt','w')
t=[]
for line in f.readlines():
    line =line.split('\t')
    t.append('%s,%s\n' %(line[0], line[3]))
    f1.writelines(t)

What am I doing wrong? Why is it repeating?

Please help.

Thanks!

Upvotes: 1

Views: 2946

Answers (2)

John Machin

Reputation: 82934

As already mentioned, the last line is incorrectly indented. On top of that, you are making things harder and more error-prone than they need to be. You don't need the t list, and you don't need to use f.readlines().

Another problem with your code is that your line[3] will end with a newline (because readlines() and friends leave the newline at the end of each line), and you are adding another newline in the format '%s,%s\n' ... this would have produced double spacing in your output file, but you haven't mentioned that.
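A minimal sketch of the newline point, using a made-up sample line in place of a real line from data.txt (the variable names are illustrative only):

```python
# Hypothetical sample standing in for one line read from data.txt.
line = 'a\tb\tc\td\n'

# Splitting without stripping leaves the newline on the last field.
fields = line.split('\t')
last_raw = fields[3]                          # 'd\n'

# Stripping the newline first gives a clean last field.
clean = line.rstrip('\n').split('\t')
last_clean = clean[3]                         # 'd'

# The format string adds its own newline, so the raw field doubles it.
doubled = '%s,%s\n' % (fields[0], last_raw)   # ends with two newlines
single = '%s,%s\n' % (clean[0], last_clean)   # ends with one newline
```
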

Also you say that you want b, d in the first output line, and you say that you get b, d -- however your code says '%s,%s\n' %(line[0], line[3]) which would produce a,d. Note TWO differences: (1) space missing (2) a instead of b.

Overall: you say that you get b, d\n but the code that you show would produce a,d\n\n. In future, please show code and output that correspond with each other. Use copy/paste; don't type from memory.

Try this:

f = open('data.txt', 'r')
f1 = open('newdata.txt','w')
for line in f: # reading one line at a time
    fields = line.rstrip('\n').split('\t')
    # ... using rstrip to remove the newline.
    # Re-using the name `line` as you did makes your script less clear.
    f1.write('%s,%s\n' % (fields[0], fields[3]))
    # Change the above line as needed to make it agree with your desired output.
f.close()
f1.close()
# Always close files when you have finished with them,
# especially files that you have written to.
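If the desired output really is `b, d` (second and fourth columns, with a space after the comma), the same loop could be sketched as below. The helper name `select_columns` and its parameters are not from the question; they are assumptions made for illustration, and the sample data is fed from an in-memory buffer instead of data.txt.

```python
import io

def select_columns(lines, indices=(1, 3), sep=', '):
    """Yield one output line per input line, keeping only the chosen fields.

    Hypothetical helper: `indices` are zero-based column positions.
    """
    for line in lines:
        # Strip the trailing newline before splitting on tabs.
        fields = line.rstrip('\n').split('\t')
        yield sep.join(fields[i] for i in indices) + '\n'

# In-memory stand-in for the tab-delimited input file.
sample = io.StringIO('a\tb\tc\td\n1\t2\t3\t4\n5\t6\t7\t8\n9\t10\t11\t12\n')
result = ''.join(select_columns(sample))
```

With real files, something like `with open('data.txt') as f, open('newdata.txt', 'w') as f1: f1.writelines(select_columns(f))` should do the same job, and the `with` statement closes both files for you.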

Upvotes: 1

Mark Byers

Reputation: 838336

The indentation is wrong, so you are writing the entire list t on every iteration instead of only once at the end. Change it to this:

t=[]
for line in f.readlines():
    line = line.split('\t')
    t.append('%s,%s\n' % (line[0], line[3]))
f1.writelines(t)
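To see the effect of the indentation in isolation, here is a small sketch that writes to in-memory buffers instead of newdata.txt. The `rows` list and the buffer names are made up for the demonstration.

```python
import io

# Hypothetical already-formatted output rows.
rows = ['b, d\n', '2, 4\n', '6, 8\n']

# writelines() inside the loop: the whole list so far is written each time.
wrong = io.StringIO()
t = []
for row in rows:
    t.append(row)
    wrong.writelines(t)

# writelines() after the loop: the list is written exactly once.
right = io.StringIO()
t = []
for row in rows:
    t.append(row)
right.writelines(t)

wrong_text = wrong.getvalue()   # 1 + 2 + 3 = 6 lines, with repeats
right_text = right.getvalue()   # 3 lines, no repeats
```
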

Alternatively, you could write the lines one at a time instead of waiting until the end; then you don't need the list t at all.

for line in f.readlines():
    line = line.split('\t')
    s = '%s,%s\n' % (line[0], line[3])
    f1.write(s)

Upvotes: 4
