Reputation: 157
I have a text file that looks like this small example:
Name sample1 sample2 sample3
A2M 9805.6 3646.8 1376.48
ACVR1C 20 37.8 20
ADAM12 197.8 120.96 31.28
I am trying to reorganize the data and write a new text file that looks like this expected output:
Name Sample
A2M 9805.6
A2M 3646.8
A2M 1376.48
ACVR1C 20
ACVR1C 37.8
ACVR1C 20
ADAM12 197.8
ADAM12 120.96
ADAM12 31.28
In other words, the last 3 columns of the input data go into the 2nd column of the output, and every item in the 1st column of the input file is repeated 3 times (there are 3 samples per Name).
To do so, I wrote the following code in Python 3:
    def convert(input_file, output_file):
        with open(input_file, 'r') as infile:
            res = {}
            line = infile.split()
            res.keys = line[0]
            res.values = line[2:]
            outfile = open(output_file, "w")
            for k, v in res.items():
                outfile.write(str(k) + '\t'+ str(v) + '\n')
But it does not produce the output I want. Do you know how to fix it?
Upvotes: 0
Views: 140
Reputation: 5785
Try this,
    d = {}
    with open('file1.txt', 'r') as f:  # Your file
        header = next(f)
        for i in f:
            d.setdefault(i.split()[0], []).extend(i.split()[1:])
    with open('nflie1.txt', 'w') as f:  # New file
        f.write('Name Sample\n')
        for k, v in d.items():
            for el in v:
                f.write('{} {}\n'.format(k, el))
Output:
Name Sample
A2M 9805.6
A2M 3646.8
A2M 1376.48
ACVR1C 20
ACVR1C 37.8
ACVR1C 20
ADAM12 197.8
ADAM12 120.96
ADAM12 31.28
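To see what the dictionary holds before the second loop writes it out, here is a minimal sketch run on the sample rows kept in memory. The `lines` list and the `pairs` name are illustrative stand-ins and not part of the code above. Two things to keep in mind with this approach: `setdefault` merges rows that share the same Name, and the output order relies on dicts preserving insertion order (guaranteed from Python 3.7).

```python
# Sample rows from the question, held in memory instead of being read from a file.
lines = [
    "A2M 9805.6 3646.8 1376.48",
    "ACVR1C 20 37.8 20",
    "ADAM12 197.8 120.96 31.28",
]

d = {}
for line in lines:
    # First field is the name, the rest are the sample values.
    name, *samples = line.split()
    # Create the list on first sight of a name, then append the samples to it.
    d.setdefault(name, []).extend(samples)

# Flatten to (name, value) pairs, one per output row.
pairs = [(name, value) for name, values in d.items() for value in values]

for name, value in pairs:
    print(name, value)
```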
Upvotes: 1
Reputation: 19414
You have a few problems in your code. First, you should also open the outfile within the with statement. Second, a dict's keys and values attributes are read-only, so you cannot assign to them. Last, you try to split the whole file object, which is not possible. You want to loop over the lines like so:
    def convert(input_file, output_file):
        with open(input_file) as infile, open(output_file, "w") as outfile:
            outfile.write("Name\tSample\n")
            next(infile, None)  # skip the input header line
            for line in infile:
                values = line.split()
                # write one output row per sample value
                for value in values[1:]:
                    outfile.write(values[0] + "\t" + value + "\n")
Although you should consider changing your format to CSV and reading it into a dataframe.
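As a rough sketch of the dataframe route, assuming pandas is installed: the sample rows are read from an in-memory `io.StringIO` that stands in for the real file, and the names `raw`, `wide`, `long_df` and `rows` are made up for illustration. Reading with `dtype=str` is a choice made here so that values such as 20 are not turned into 20.0.

```python
import io

import pandas as pd

# Sample rows from the question; io.StringIO stands in for the real input file.
raw = io.StringIO(
    "Name sample1 sample2 sample3\n"
    "A2M 9805.6 3646.8 1376.48\n"
    "ACVR1C 20 37.8 20\n"
    "ADAM12 197.8 120.96 31.28\n"
)

# Whitespace-separated input; keep every value as text.
wide = pd.read_csv(raw, sep=r"\s+", dtype=str)

long_df = (
    # One row per (Name, sample column); keep the original row index.
    wide.melt(id_vars="Name", value_name="Sample", ignore_index=False)
    # Stable sort on the original row index groups each Name's samples
    # together while keeping the sample1, sample2, sample3 order.
    .sort_index(kind="mergesort")
    .reset_index(drop=True)[["Name", "Sample"]]
)

rows = list(long_df.itertuples(index=False, name=None))
print(long_df.to_string(index=False))
```

To write the result to a tab-separated file, something like `long_df.to_csv("output.txt", sep="\t", index=False)` should do; the file name here is only a placeholder.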
Upvotes: 1