Reputation: 23
I'm trying to write a script that searches multiple report files for specific data and writes it into columns of a single CSV.
The lines I need aren't always at the same line number in each report, so I'm searching for them by their content. These are the lines I'm after:
Estimate file: pog_example.bef
Estimate ID: o1_p1
61078 (100.0%) estimated.
I want to write the data from each report file into the columns of a CSV, like this:
example.bef, o1_p1, 61078 (100.0%) estimated
So far I have this script, which writes out the first of my criteria, but I can't figure out how to continue through the file to find the second and third lines and populate the second and third columns:
from glob import glob
import fileinput
import csv

with open('percentage_estimated.csv', 'w', newline='') as est_report:
    writer = csv.writer(est_report)
    for line in fileinput.input(glob('*.bef*')):
        if 'Estimate file' in line:
            writer.writerow([line.split('pog_')[1].strip()])
I'm pretty new to Python, so any help would be appreciated!
Upvotes: 0
Views: 724
Reputation: 23
In case anyone wants to see it, here is what finally worked for me:
from glob import glob
import csv

all_rows = []
with open('percentage_estimated.csv', 'w', newline='') as bef_report:
    writer = csv.writer(bef_report)
    writer.writerow(['File name', 'Est ID', 'Est Value'])
    for file in glob('*.bef*'):
        with open(file, 'r') as f:
            for line in f:
                if 'Estimate file' in line:
                    fname = line.split('pog_')[1].strip()
                    line = next(f)
                    est_id = line.split('Estimate ID:')[1].strip()
                    # step over the lines between the ID and the value
                    line = next(f)
                    line = next(f)
                    line = next(f)
                    line = next(f)
                    line = next(f)
                    line = next(f)
                    line = next(f)
                    value = line.strip()
                    row = [fname, est_id, value]
                    all_rows.append(row)
                    break  # only the first match per file is needed
    writer.writerows(all_rows)
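The run of repeated next(f) calls above could be folded into a small helper. This is only a sketch, assuming the value line always sits a fixed number of lines below the ID line; skip_lines is a hypothetical name, not part of the script above, and the sample lines are made up for illustration:

```python
from itertools import islice

def skip_lines(lines, count):
    """Advance the iterator by `count` lines and return the last line read.

    Returns None if the iterator is already exhausted.
    """
    last = None
    for last in islice(lines, count):
        pass
    return last

# Made-up sample: the value is three lines below the ID line.
lines = iter([
    "Estimate ID: o1_p1\n",
    "\n",
    "\n",
    "61078 (100.0%) estimated.\n",
    "tail\n",
])
est_id = next(lines).split('Estimate ID:')[1].strip()
value = skip_lines(lines, 3).strip()
```

In the script above, the seven line = next(f) statements would then become a single line = skip_lines(f, 7).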
Upvotes: 1
Reputation: 11188
I think I see what you're trying to do, but I'm not sure.
I think your BEF file might look something like this:
a line
another line
Estimate file: pog_example.bef
Estimate ID: o1_p1
61078 (100.0%) estimated.
still more lines
If that's true, then once you find a line with 'Estimate file', you need to take control from the for-loop and start manually iterating the lines, because you know which lines are coming up.
This is a very simple example script which opens my mock BEF file (above) and automatically iterates the lines until it finds 'Estimate file'. From there it processes each line specifically, using next(bef_file) to iterate to the next line, expecting them to have the correct text:
import csv

all_rows = []
with open('input.bef') as bef_file:
    for line in bef_file:
        if 'Estimate file' in line:
            fname = line.split('pog_')[1].strip()
            line = next(bef_file)
            est_id = line.split('Estimate ID:')[1].strip()
            line = next(bef_file)
            value = line.strip()
            row = [fname, est_id, value]
            all_rows.append(row)
            break  # stop iterating lines in this file

with open('output.csv', 'w', newline='') as csv_out:
    writer = csv.writer(csv_out)
    writer.writerow(['File name', 'Est ID', 'Est Value'])
    writer.writerows(all_rows)
When I run that I get this for output.csv:
File name,Est ID,Est Value
example.bef,o1_p1,61078 (100.0%) estimated.
If there are blank lines in your data between the lines you care about, manually step over them with next(bef_file) statements.
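If the number of blank lines varies between files, counting next(bef_file) calls gets fragile. Here is a rough sketch of one alternative, assuming the only thing between the lines you care about is blank lines; next_nonblank and parse_report are names I made up for illustration, and the sample text is a mock:

```python
import io

def next_nonblank(lines):
    """Return the next non-blank line, stripped; raise ValueError at end of file."""
    for line in lines:
        if line.strip():
            return line.strip()
    raise ValueError("ran out of lines before finding data")

def parse_report(bef_file):
    """Return [file name, estimate ID, value] for the first match, or None."""
    for line in bef_file:
        if 'Estimate file' in line:
            fname = line.split('pog_')[1].strip()
            est_id = next_nonblank(bef_file).split('Estimate ID:')[1].strip()
            value = next_nonblank(bef_file)
            return [fname, est_id, value]
    return None

# Mock report with a varying number of blank lines between the fields.
sample = io.StringIO(
    "a line\n"
    "Estimate file: pog_example.bef\n"
    "\n"
    "Estimate ID: o1_p1\n"
    "\n\n"
    "61078 (100.0%) estimated.\n"
    "still more lines\n"
)
row = parse_report(sample)
print(row)
```

With a real file you would pass the object returned by open() instead of the io.StringIO mock. If the lines in between are not blank but contain other text, this will not work as-is and you would need to test for the text you expect instead.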
Upvotes: 1