Parsing output (text) into dataframe with columns and rows

Question

Complete newby here. I'm trying to parse a simulation output with Python into a frame and write it into an excel sheet.

The output looks like this:

F100.T,557.9567856878748,F,F,F,F,
F100.Tv,557.9567856878748,F,F,F,F,
F100.Tl,557.9567856878748,F,F,F,F,
F100.Duty,-106382.60618934222,F,F,F,T,1
... 
F200.T,557.9567856878748,F,F,F,F,
F200.Tv,557.9567856878748,F,F,F,F,
F200.Tl,557.9567856878748,F,F,F,F,
F200.Duty,-37798.28473117316,F,F,F,T,1
... and so on

How it should look like at the end:

    | F100               | F200           |
    | -------------------|----------------| 
 T  | 557.9567856878748  | 100            |
 Tv | 557.9567856878748  | 5.550847203    |
 T1 |-106382.60618934222 | 3.798721561    |

... and so on

What I've tried so far:

import itertools
import pandas as pd


def read_lines(file_object) -> list:
    return [
        parse_line(line) for line in file_object.readlines() if line.strip()
    ]


def parse_line(line: str) -> list:
    return [
        i.split(",")[1]
        for i in line.strip().split()
        if i.startswith(("F100", "F200"))
    ]


def flatten(parsed_lines: list) -> list:
    return list(itertools.chain.from_iterable(parsed_lines))


def cut_into_pieces(flattened_lines: list, piece_size: int = 2) -> list:
    return [
        flattened_lines[i:i + piece_size] for i
        in range(0, len(flattened_lines), piece_size)
    ]


with open("sim.txt") as data:
    df = pd.DataFrame(
        cut_into_pieces(flatten(read_lines(data))),
        columns=["F100", "F200"],
    )
    print(df)
    df.to_excel("table.xlsx", index=False)

So far, the result looks like this:

                    F100                 F200
1      557.9567856878748    100
2      557.9567856878748    5.550847203
3    -106382.60618934222    3.798721561
..                   ...                  ...

As you see, the rows are not named (T, Tv, T1 etc.). Im hitting a wall and don't know how to continue from here.
Thanks in advance

	F100	F200
Duty	-106383	-37798.3
T	557.957	557.957
Tl	557.957	557.957
Tv	557.957	557.957

Parsing output (text) into dataframe with columns and rows

Answers (1)

Related Questions