Reputation: 8536
When I program, I often use external software to do the heavy computations and then analyze the results in Python. This external software is often written in Fortran, C or C++, and it works by reading input file(s). An input file can be either a small file telling the program which mode to use for certain calculations, or a large data file it has to process. These files often use a fixed format (so and so many spaces between data columns). An example of a data file I currently use is given below.
This is a header. The first line is always a header...
7352.103 26.0 2.61 -8.397 11.2
7353.510 26.0 4.73 -1.570 3.5
7356.643 26.0 5.75 -2.964 9.0
7356.648 26.0 5.35 -3.187 9.0
7364.034 26.0 5.67 -5.508 1.7
7382.523 26.0 5.61 -3.935 1.9
My question is whether there exists a Python library that creates such input files by reading a template (given by a coworker or taken from the documentation of the external software).
Usually I have all the columns in NumPy format and want to pass them to a function that creates an input file, using the template as an example. I'm not looking for a brute-force method, which can get ugly very quickly.
I am not sure what to search for here, and any help is appreciated.
Upvotes: 4
Views: 1177
Reputation: 231425
I can basically replicate your sample with savetxt. Its fmt parameter gives me the same sort of formatting control that FORTRAN code uses for reading and writing files. It preserves spaces in the same way that FORTRAN and C print statements do.
import numpy as np
# the '...' row stands for the remaining data rows of the question; fill them in before running
example = """
This is a header. The first line is always a header...
7352.103 26.0 2.61 -8.397 11.2
...
"""
lines = example.split('\n')[1:]  # drop the empty first line
header = lines[0]
data = []
for line in lines[1:]:
    if len(line):
        data.append([float(x) for x in line.split()])
data = np.array(data)
fmt = '%10.3f %9.1f %9.2f %9.3f %20.1f'  # similar to a FORTRAN format statement
filename = 'stack21865757.txt'
with open(filename, 'w') as f:
    np.savetxt(f, data, fmt, header=header)
with open(filename) as f:
    print(f.read())
producing:
# This is a header. The first line is always a header...
7352.103 26.0 2.61 -8.397 11.2
7353.510 26.0 4.73 -1.570 3.5
...
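Note that savetxt prefixes the header with '# ' by default, which is why the output above differs from the sample file. If the external program expects the bare header line, the comments parameter can be set to an empty string. A minimal sketch, assuming ten-character columns and an in-memory buffer (both are illustrative choices, not taken from the question):

```python
import io
import numpy as np

data = np.array([
    [7352.103, 26.0, 2.61, -8.397, 11.2],
    [7353.510, 26.0, 4.73, -1.570, 3.5],
])
header = "This is a header. The first line is always a header..."

# Assumed layout: five right-aligned columns, each 10 characters wide
fmt = "%10.3f%10.1f%10.2f%10.3f%10.1f"

buf = io.StringIO()  # stands in for a real file handle
# comments="" suppresses the default "# " prefix on the header line
np.savetxt(buf, data, fmt=fmt, header=header, comments="")
text = buf.getvalue()
print(text)
```

Because fmt contains one specifier per column, savetxt uses it as a complete row format and the delimiter argument is not needed.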
EDIT
Here's a crude script that converts an example line into a format:
import re
# field widths (10, 10, 10, 10, 29) are taken from the printed result below
tmplt = '  7352.103      26.0      2.61    -8.397                         11.2'
def fmt_from_template(tmplt):
    pat = r'( *-?\d+\.(\d+))'  # one number with its leading spaces and its decimals
    fmt = []
    while tmplt:
        match = re.match(pat, tmplt)
        if not match:
            break  # nothing numeric left; avoids looping forever on trailing text
        x = len(match.group(1))  # width of the whole field, leading spaces included
        d = len(match.group(2))  # number of decimals
        fmt += ['%%%d.%df' % (x, d)]
        tmplt = tmplt[x:]
    fmt = ''.join(fmt)
    return fmt
print(fmt_from_template(tmplt))
# %10.3f%10.1f%10.2f%10.3f%29.1f
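One way to gain some confidence in a derived format is a round trip: parse the numbers out of the sample line, format them again, and compare with the original line. The sketch below does that with a sample line that assumes ten-character columns; the helper name fmt_from_line is hypothetical. It only understands fixed-point decimals, so integer columns or exponent notation would need a wider pattern.

```python
import re

def fmt_from_line(line):
    """Build a printf-style format from one fixed-width sample line."""
    parts = []
    for m in re.finditer(r" *-?\d+\.(\d+)", line):
        width = len(m.group(0))      # field width, leading spaces included
        decimals = len(m.group(1))   # digits after the decimal point
        parts.append("%%%d.%df" % (width, decimals))
    return "".join(parts)

# Assumed sample line: five right-aligned columns, 10 characters each
sample = "  7352.103      26.0      2.61    -8.397      11.2"

fmt = fmt_from_line(sample)
values = tuple(float(x) for x in sample.split())

# Round trip: formatting the parsed values should reproduce the sample exactly
round_trip_ok = (fmt % values) == sample
print(fmt)
print(round_trip_ok)
```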
Upvotes: 5
Reputation: 11205
Adapting hpaulj's answer to automatically extract the fmt for savetxt:
from __future__ import print_function
import numpy as np
import re
example = """
This is a header. The first line is always a header...
7352.103 26.0 2.61 -8.397 11.2
7353.510 26.0 4.73 -1.570 3.5
7356.643 26.0 5.75 -2.964 9.0
7356.648 26.0 5.35 -3.187 9.0
7364.034 26.0 5.67 -5.508 1.7
7382.523 26.0 5.61 -3.935 1.9
"""
def extract_format(line):
    # \s* (rather than \s+) also matches a first column that has no leading whitespace
    def specs():
        for match in re.finditer(r"\s*-?\d+\.(\d+)", line):
            # width = whole field including leading whitespace, precision = decimal count
            yield "%{}.{}f".format(len(match.group(0)), len(match.group(1)))
    return "".join(specs())
lines = example.split('\n')[1:]  # drop the empty first line
header = lines[0]
data = []
for line in lines[1:]:
    if len(line):
        data.append([float(x) for x in line.split()])
data = np.array(data)
fmt = extract_format(lines[1])  # similar to a FORTRAN format statement
filename = 'stack21865757.txt'
with open(filename, 'w') as f:
    print(header, file=f)  # write the header unchanged, without a '# ' prefix
    np.savetxt(f, data, fmt)
with open(filename) as f:
    print(f.read())
producing
This is a header. The first line is always a header...
7352.103 26.0 2.61 -8.397 11.2
7353.510 26.0 4.73 -1.570 3.5
7356.643 26.0 5.75 -2.964 9.0
7356.648 26.0 5.35 -3.187 9.0
7364.034 26.0 5.67 -5.508 1.7
7382.523 26.0 5.61 -3.935 1.9
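The script above derives the format from the first data line only. If the template is not truly fixed-width, other lines can imply different field widths, so it may be worth checking that all data lines agree before trusting the result. A rough sketch of such a check, assuming the first non-empty line is the header; template_format is a hypothetical helper name, and extract_format is repeated here only to keep the snippet self-contained:

```python
import re

def extract_format(line):
    """Derive a printf-style format from one fixed-width data line."""
    return "".join(
        "%{}.{}f".format(len(m.group(0)), len(m.group(1)))
        for m in re.finditer(r"\s*-?\d+\.(\d+)", line)
    )

def template_format(template_text):
    """Return the single format shared by all data lines, or raise ValueError."""
    rows = [ln for ln in template_text.splitlines() if ln.strip()]
    data_rows = rows[1:]  # assumption: the first non-empty line is the header
    formats = {extract_format(ln) for ln in data_rows}
    if len(formats) != 1:
        raise ValueError("template lines disagree: %s" % sorted(formats))
    return formats.pop()

template = "A header line\n  1.50   2.250\n 10.25  -3.125\n"
print(template_format(template))
# %6.2f%8.3f
```

A template whose columns are separated by single spaces rather than aligned would typically fail this check, which is a hint that a plain delimiter is enough for that file.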
Upvotes: 2
Reputation: 1396
If your header is always the same, you could look into pandas. It would let you move columns around easily, just by knowing the column names from the header. Even if the header isn't always the same, as long as you can get the column names from the template, you could still rearrange the columns.
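A minimal sketch of that idea, assuming the header holds one name per column. The column names, the new column order and the ten-character widths below are all made up for illustration, since the sample file in the question has a free-text header:

```python
import io
import pandas as pd

# Hypothetical column names; the data rows are taken from the question
raw = """wavelength element ep loggf ew
7352.103 26.0 2.61 -8.397 11.2
7353.510 26.0 4.73 -1.570 3.5
"""

# Read whitespace-separated columns; the first line supplies the names
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Reorder by name (the order the external program wants is an assumption)
df = df[["element", "wavelength", "ep", "loggf", "ew"]]

# Write fixed-width rows: one printf-style specifier per column (assumed widths)
fmt = "%10.1f%10.3f%10.2f%10.3f%10.1f"
lines = [fmt % tuple(row) for row in df.itertuples(index=False)]
print("\n".join(lines))
```

The formatting step is plain Python string formatting, so the same fmt string could also be passed to savetxt together with df.to_numpy().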
If I have misunderstood the question, I apologize; more concrete data or a longer example would make it easier to give a better answer.
Upvotes: 1