user10657934

Reputation: 157

Re-organizing the data in a text file in Python 3

I have a text file that looks like this small example:

Name    sample1 sample2 sample3
A2M 9805.6  3646.8  1376.48
ACVR1C  20  37.8    20
ADAM12  197.8   120.96  31.28

I am trying to re-organize the data and write a new text file that looks like this expected output:

Name    Sample
A2M 9805.6
A2M 3646.8
A2M 1376.48
ACVR1C  20
ACVR1C  37.8
ACVR1C  20
ADAM12  197.8
ADAM12  120.96
ADAM12  31.28

In other words, the last 3 columns of the input file are stacked into the 2nd column of the output file, and every item in the 1st column of the input file is repeated 3 times (there are 3 samples per Name).

To do so, I wrote the following code in Python 3:

def convert(input_file, output_file):
    with open(input_file, 'r') as infile:
        res = {}
        line = infile.split()
        res.keys = line[0]
        res.values = line[2:]
        outfile = open(output_file, "w")
        for k, v in res.items():
            outfile.write(str(k) + '\t'+ str(v) + '\n')

But it does not produce the output I want. Do you know how to fix it?

Upvotes: 0

Views: 140

Answers (2)

shaik moeed

Reputation: 5785

Try this:

d = {}
with open('file1.txt', 'r') as f:  # your input file
    next(f)  # skip the header line
    for line in f:
        name, *samples = line.split()
        # collect the sample values under each name
        d.setdefault(name, []).extend(samples)

with open('nflie1.txt', 'w') as f:  # new file
    f.write('Name Sample\n')
    for name, samples in d.items():
        for value in samples:
            f.write('{} {}\n'.format(name, value))

Output:

Name Sample
A2M 9805.6
A2M 3646.8
A2M 1376.48
ACVR1C 20
ACVR1C 37.8
ACVR1C 20
ADAM12 197.8
ADAM12 120.96
ADAM12 31.28

Upvotes: 1

Tomerikoo

Reputation: 19414

You have a few problems in your code.

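First, you should also open the output file inside the with statement, so that it is closed automatically. Second, keys and values are read-only methods of a dict, so you cannot assign to them; a dict is filled with res[key] = value instead. Last, you call split() on the file object itself, which has no such method. You want to loop over the lines of the file, like so: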

def convert(input_file, output_file):
    with open(input_file) as infile, open(output_file, "w") as outfile:
        outfile.write("Name\tSample\n")
        next(infile)  # skip the input header line
        for line in infile:
            values = line.split()
            for value in values[1:]:
                outfile.write(values[0] + "\t" + value + "\n")
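
To check the loop above without creating any files, the same logic can be run on in-memory lines. This is only a sketch: the helper name wide_to_long is made up for illustration, and it assumes the first line is a header and that the columns are separated by whitespace.

```python
def wide_to_long(lines):
    """Yield (name, value) pairs, one pair per sample column."""
    rows = iter(lines)
    next(rows, None)  # skip the header line (assumed to be present)
    for line in rows:
        fields = line.split()
        if not fields:  # ignore blank lines
            continue
        name, samples = fields[0], fields[1:]
        for value in samples:
            yield name, value


# The first two data rows from the question, as they would be read from a file
sample = [
    "Name    sample1 sample2 sample3\n",
    "A2M 9805.6  3646.8  1376.48\n",
    "ACVR1C  20  37.8    20\n",
]
pairs = list(wide_to_long(sample))
print(pairs[:3])
```

Because the helper accepts any iterable of lines, an open file object can be passed to it directly.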

That said, you should also consider changing your format to CSV and reading it into a dataframe.
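
As a rough illustration of the CSV suggestion, the standard-library csv module can write the long format with proper quoting. This is a sketch under assumptions: the function name convert_to_csv and the file names are hypothetical, and the input is assumed to be whitespace-separated with one header line.

```python
import csv
import os
import tempfile


def convert_to_csv(input_file, output_file):
    # newline="" lets the csv module control line endings itself
    with open(input_file) as infile, open(output_file, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(["Name", "Sample"])
        next(infile, None)  # skip the input header (assumed to be present)
        for line in infile:
            fields = line.split()
            for value in fields[1:]:
                writer.writerow([fields[0], value])


# Small self-check using temporary files (paths are illustrative only)
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "wide.txt")
    dst = os.path.join(tmp, "long.csv")
    with open(src, "w") as f:
        f.write("Name sample1 sample2\nA2M 9805.6 3646.8\n")
    convert_to_csv(src, dst)
    with open(dst, newline="") as f:
        rows = list(csv.reader(f))
print(rows)
```

A file written this way can then be loaded by a dataframe library, for example with pandas.read_csv.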

Upvotes: 1
