Rei
Rei

Reputation: 329

Change formatted file per record

I'm looking to change a formatted file that looks like this:

1182659 Sample05 22
1182659 Sample33 14
4758741 Sample05 74
4758741 Sample33 2
3652147 Sample05 8
3652147 Sample33 34

To this:

       Sample05 Sample33 
1182659 22 14
4758741 74 2
3652147 8 34

One way I see to do it, is to use a double indexed dictionary but I would like to know if there is an easier way to do this before I go in.

Upvotes: 0

Views: 24

Answers (2)

Andrej Kesely
Andrej Kesely

Reputation: 195428

Without pandas, but with groupby from itertools:

from itertools import groupby

data = """
1182659 Sample05 22
1182659 Sample33 14
4758741 Sample05 74
4758741 Sample33 2
3652147 Sample05 8
3652147 Sample33 34
"""

groups = groupby((line.split() for line in data.splitlines() if line), key=lambda v: v[0])

rows = []
headers = []
for g, v in groups:
    v = list(v)
    for i in v:
        if i[1] not in headers:
            headers.append(i[1])
    rows.append([g] + [i[-1] for i in v])

print('\t'+ '\t'.join(headers))
for row in rows:
    for value in row:
        print(value, end='\t')
    print()

Prints:

    Sample05    Sample33
1182659 22  14  
4758741 74  2   
3652147 8   34  

Upvotes: 1

sundance
sundance

Reputation: 2945

Use pandas:

import pandas as pd

# if the delimeter is a space
df = pd.read_csv("<path to file>.txt", sep=" ", header=None)
df.set_index([0, 1])[2].unstack()

Output:

1        Sample05  Sample33
0
1182659        22        14
3652147         8        34
4758741        74         2

Upvotes: 0

Related Questions