Reputation: 131
I am using Python 3 and I need to extract data from a file. An example of the data is below:
ENERGY_BOUNDS
1.964033E+07 1.733253E+07 1.491825E+07 1.384031E+07 1.161834E+07 1.000000E+07 8.187308E+06 6.703200E+06
6.065307E+06 5.488116E+06 4.493290E+06 3.678794E+06 3.011942E+06 2.465970E+06 2.231302E+06 2.018965E+06
GAMMA_INTERFACE
0
EIGENVALUE
1.219034E+00
N,2N
1.191994E+00 1.535081E+00 1.543891E+00 1.413861E+00 1.181815E+00 6.174152E-01 1.302440E-02 0.000000E+00
0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00
0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00
MACRO 1
SIGABS
-3.826074E-03 -3.707513E-04 2.610351E-03 6.961084E-03 7.832982E-03 7.512567E-03 1.018417E-02 1.276596E-02
9.148128E-03 8.828235E-03 8.527789E-03 7.514346E-03 7.544248E-03 7.801064E-03 7.724884E-03 7.047571E-03
5.280749E-03 3.999751E-03 3.821688E-03 3.748186E-03 3.712753E-03 3.591795E-03 3.390300E-03 3.180354E-03
SIGTRAN
7.513455E-02 8.061355E-02 8.377954E-02 8.787775E-02 9.114071E-02 9.170817E-02 9.440786E-02 9.535947E-02
1.010975E-01 1.035364E-01 1.160553E-01 1.290131E-01 1.197249E-01 1.151962E-01 1.298934E-01 1.375417E-01
1.428861E-01 1.715100E-01 1.627465E-01 2.026621E-01 2.007540E-01 1.644982E-01 1.781501E-01 1.624188E-01
The process needs to be:
Search the file line by line until the starting keyword (MACRO in this case) is found.
After this, continue searching line by line until the specific identifier is found.
Read each value in the lines after the identifier into an array or list.
Stop reading once another identifier is found.
So far this is what I have. The code works fine if the identifier is the first value after MACRO (e.g. if it is SIGABS) but not for any others (e.g. SIGTRAN). My results file has maybe 50 different identifiers in it so I need the code to be able to pick out one at a time.
def read_data_from_file_macro(file_name, start_macro, identifier):
    with open(file_name, 'r') as read_obj:
        list_of_results = []
        # Read all lines in the file one by one
        for line in read_obj:
            # For each line, check if the line contains the start keyword
            if start_macro in line:
                # If MACRO is found, read the next line and look for the identifier
                nextValue = next(read_obj)
                if identifier in nextValue:
                    # If the identifier is found, read the next line
                    nextValue = next(read_obj)
                    # Keep reading until the next identifier appears
                    while not nextValue.strip().isidentifier():
                        list_of_results.extend(nextValue.split())
                        nextValue = next(read_obj)
    # Convert to float
    for i in range(0, len(list_of_results)):
        list_of_results[i] = float(list_of_results[i])
    return list_of_results
Upvotes: 1
Views: 105
Reputation: 10624
Try the following. It treats your file as text, extracts the part between your start_identifier and end_identifier, and converts it to a list of floats, which is then appended to your list_of_results (this list must already exist before the function is called, so create it first). You can run it for any pair of identifiers. Let me know how it works.
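def read_data_from_file_macro(file_name, start_identifier, end_identifier):
    with open(file_name) as f:
        t = f.read()
    # Discard everything before the MACRO block
    t = t[t.find('MACRO'):]
    # Keep only the text between the two identifiers, looking for the
    # end identifier after the start identifier
    start = t.find(start_identifier) + len(start_identifier)
    t = t[start:t.find(end_identifier, start)]
    # split() already splits on newlines; deleting them first would fuse
    # the last number of one line with the first number of the next
    t = [float(i) for i in t.split() if not i.isidentifier()]
    # list_of_results must already exist in the enclosing scope
    list_of_results.extend(t)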
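If you would rather not pass an end identifier, the four steps in the question can also be sketched as a line-by-line reader. This is only a rough sketch under a few assumptions: the helper names is_number and read_block are made up for illustration, a header line is taken to be any line that contains a token float() cannot parse, and the identifier is matched against the whole stripped line rather than as a substring.

```python
def is_number(token):
    """Return True if the token can be parsed as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def read_block(lines, start_macro, identifier):
    """Collect the floats listed under `identifier` after `start_macro`.

    `lines` can be an open file or any iterable of strings.
    """
    results = []
    in_macro = False    # True once the start keyword has been seen
    collecting = False  # True while reading the wanted block
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # ignore blank lines
        if not in_macro:
            # Step 1: skip everything up to the start keyword
            in_macro = start_macro in line
        elif not collecting:
            # Step 2: skip other blocks until the identifier line appears
            collecting = line.strip() == identifier
        elif all(is_number(tok) for tok in tokens):
            # Step 3: a data line, so keep its values
            results.extend(float(tok) for tok in tokens)
        else:
            # Step 4: the next header line ends the block
            break
    return results


def read_data_from_file_macro(file_name, start_macro, identifier):
    """Same signature as in the question, reading from a file on disk."""
    with open(file_name) as f:
        return read_block(f, start_macro, identifier)
```

Testing for numbers instead of calling isidentifier() is a deliberate choice here: headers such as "N,2N" and "MACRO 1" are not valid Python identifiers, so isidentifier() returns False for them and they would be passed on to float(). One caveat of this sketch is that float() also accepts strings like "nan" and "inf", so a header with such a name would be mistaken for data.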
Upvotes: 1