Reputation: 3
Complete newby here. I'm trying to parse a simulation output with Python into a frame and write it into an excel sheet.
The output looks like this:
F100.T,557.9567856878748,F,F,F,F,
F100.Tv,557.9567856878748,F,F,F,F,
F100.Tl,557.9567856878748,F,F,F,F,
F100.Duty,-106382.60618934222,F,F,F,T,1
...
F200.T,557.9567856878748,F,F,F,F,
F200.Tv,557.9567856878748,F,F,F,F,
F200.Tl,557.9567856878748,F,F,F,F,
F200.Duty,-37798.28473117316,F,F,F,T,1
... and so on
How it should look like at the end:
| F100 | F200 |
| -------------------|----------------|
T | 557.9567856878748 | 100 |
Tv | 557.9567856878748 | 5.550847203 |
T1 |-106382.60618934222 | 3.798721561 |
... and so on
What I've tried so far:
import itertools
import pandas as pd
def read_lines(file_object) -> list:
return [
parse_line(line) for line in file_object.readlines() if line.strip()
]
def parse_line(line: str) -> list:
return [
i.split(",")[1]
for i in line.strip().split()
if i.startswith(("F100", "F200"))
]
def flatten(parsed_lines: list) -> list:
return list(itertools.chain.from_iterable(parsed_lines))
def cut_into_pieces(flattened_lines: list, piece_size: int = 2) -> list:
return [
flattened_lines[i:i + piece_size] for i
in range(0, len(flattened_lines), piece_size)
]
with open("sim.txt") as data:
df = pd.DataFrame(
cut_into_pieces(flatten(read_lines(data))),
columns=["F100", "F200"],
)
print(df)
df.to_excel("table.xlsx", index=False)
So far, the result looks like this:
F100 F200
1 557.9567856878748 100
2 557.9567856878748 5.550847203
3 -106382.60618934222 3.798721561
.. ... ...
As you see, the rows are not named (T, Tv, T1 etc.). Im hitting a wall and don't know how to continue from here.
Thanks in advance
Upvotes: 0
Views: 159
Reputation: 6134
import pandas as pd
from io import StringIO
txt = """\
F100.T,557.9567856878748,F,F,F,F,
F100.Tv,557.9567856878748,F,F,F,F,
F100.Tl,557.9567856878748,F,F,F,F,
F100.Duty,-106382.60618934222,F,F,F,T
F200.T,557.9567856878748,F,F,F,F,
F200.Tv,557.9567856878748,F,F,F,F,
F200.Tl,557.9567856878748,F,F,F,F,
F200.Duty,-37798.28473117316,F,F,F,T
"""
df = pd.read_csv(StringIO(txt),header=None)
# create two new columns from the first column
df[['a','b']] = df[0].str.split('.',expand=True)
df.pivot('b','a',1)
# note: if index/column values are not unique
# then you can aggregate them like this:
# df.pivot_table(index='b',columns='a',values=1, aggfunc='mean')
F100 | F200 | |
---|---|---|
Duty | -106383 | -37798.3 |
T | 557.957 | 557.957 |
Tl | 557.957 | 557.957 |
Tv | 557.957 | 557.957 |
Upvotes: 1