Reputation: 1124
I am processing a very large log file to extract information using Python regex. However, I would like to process only the lines that come after a particular string, which in this case is "Starting time loop". The minimal version of the log file is as follows:
Pstream initialized with:
floatTransfer : 0
nProcsSimpleSum : 0
commsType : nonBlocking
polling iterations : 0
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Disallowing user-supplied system call operations
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time
Create mesh for time = 0
PIMPLE: Operating solver in PISO mode
Reading g
Reading relaxProperties
Reading field p_rgh
Reading field alpha1
Reading field Urel
Reading/calculating face flux field phi
Reading transportProperties
Selecting incompressible transport model Newtonian
Selecting incompressible transport model Newtonian
Selecting turbulence model type LESModel
Selecting LES turbulence model Smagorinsky
Selecting LES delta type vanDriest
Selecting LES delta type cubeRootVol
SmagorinskyCoeffs
{
ce 1.048;
ck 0.02;
}
Reading STFProperties
Calculating field g.h
time step continuity errors : sum local = 8.4072346e-06, global = -1.5271655e-21, cumulative = -1.5271655e-21
GAMGPCG: Solving for pcorr, Initial residual = 1, Final residual = 4.7194845e-06, No Iterations 9
GAMGPCG: Solving for pcorr, Initial residual = 0.13716381, Final residual = 2.9068099e-06, No Iterations 6
time step continuity errors : sum local = 1.3456802e-10, global = -6.7890391e-13, cumulative = -6.7890392e-13
Courant Number mean: 0.021611246 max: 0.39023401
fieldAverage fieldAverage1:
Starting averaging at time 0
Starting time loop
Courant Number mean: 0.02156811 max: 0.3894551
Interface Courant Number mean: 0 max: 0
deltaT = 0.00022522523
Time = 0.000225225
Currently, the test script is as follows:
logf = open(logName, 'r')
p = logf.tell()
logf.seek(0, 0)
for l in logf:
    if l.startswith('Starting time loop'):
        print l
However, print l prints all the lines from the log file. Note that the log file is opened as logf.
Upvotes: 0
Views: 230
Reputation: 214949
The nice thing about Python iterators (file objects are iterators too) is that they keep state, so if you run two for loops over the same iterator, the second one starts where the first one stopped. This leads to the following conventional pattern:
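for line in logf:
    # Skip everything up to and including the marker line
    if line.startswith('Starting time loop'):
        break
for line in logf:
    # Process the lines after the marker here
    print(line)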
Another, more concise way to do that is itertools.dropwhile.
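A minimal sketch of the dropwhile variant, assuming the marker string from the question. The helper name lines_from_marker and the in-memory sample are made up for illustration. Note that dropwhile yields the marker line itself as its first item, so it is discarded with next() here if only the following lines are wanted.

```python
import io
from itertools import dropwhile

MARKER = 'Starting time loop'  # marker string taken from the question


def lines_from_marker(lines, marker=MARKER):
    """Return an iterator starting at the first line that begins with marker.

    Hypothetical helper: `lines` can be any iterable of strings,
    including an open file object.
    """
    return dropwhile(lambda line: not line.startswith(marker), lines)


# Small in-memory stand-in for the log file, so the sketch runs on its own.
sample = io.StringIO(
    "Create time\n"
    "Starting time loop\n"
    "Courant Number mean: 0.02156811 max: 0.3894551\n"
    "deltaT = 0.00022522523\n"
)

remaining = lines_from_marker(sample)
next(remaining, None)  # discard the marker line itself
after_marker = [line.rstrip('\n') for line in remaining]
print(after_marker)
```

With a real log, the io.StringIO object would be replaced by the open file (logf in the question). Like the two-loop pattern, this reads the file lazily, one line at a time.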
Upvotes: 5
Reputation: 2348
Without seeing the exact way you opened the log file, it's hard to give good feedback on your little script.
However, here is a little script that works as you requested:
#!/usr/bin/env python
logfile = 'logfile'
start_line = 'Starting time loop'
started = False
with open(logfile) as f:
    # Iterate over the file directly so a large log is not loaded into memory
    for l in f:
        if l.startswith(start_line):
            started = True
        if started:
            print(l.strip())
Here is a sample log file:
$ cat logfile
This is the first line
This is the 2nd line
This is the 3rd non-blank line
Starting time loop and here we go
Here are some more lines
and some more
yadda yadda yadda
yadda yadda yadda
yadda yadda yadda
...
And.. we're done
Finally, here's the run of the little log script:
$ ./log.py
Starting time loop and here we go
Here are some more lines
and some more
yadda yadda yadda
yadda yadda yadda
yadda yadda yadda
...
And.. we're done
Upvotes: 2
Reputation: 5696
The code below reads one line at a time. When the end of the file is reached, line is an empty string and the loop breaks.
with open('your_file.txt', 'r') as opened_file:
    while True:
        line = opened_file.readline()
        if not line:
            break
        else:
            # Your code goes here
            if line.startswith('Starting time loop'):
                print(line)
                break
It might be better to use with open() instead of a bare open(), since it closes the file automatically when done.
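As a rough sketch of how this could be combined with the regex extraction mentioned in the question: the function below skips ahead to the marker inside a with block and then applies a pattern to the remaining lines. The function name, the pattern, and the choice of the Courant number lines are assumptions for illustration, modelled on the sample log in the question.

```python
import re

MARKER = 'Starting time loop'  # marker string taken from the question
# Assumed pattern, modelled on the "Courant Number mean: ... max: ..." lines
COURANT_RE = re.compile(r'^Courant Number mean: (\S+) max: (\S+)')


def courant_after_marker(path, marker=MARKER):
    """Return (mean, max) float pairs found on lines after the marker.

    Hypothetical helper; returns an empty list if the marker never appears.
    """
    results = []
    with open(path, 'r') as logf:
        # Phase 1: consume lines up to and including the marker.
        for line in logf:
            if line.startswith(marker):
                break
        # Phase 2: the same file iterator continues after the marker.
        for line in logf:
            match = COURANT_RE.match(line)
            if match:
                results.append((float(match.group(1)), float(match.group(2))))
    return results
```

Calling courant_after_marker(logName) would then ignore any Courant number lines printed before the time loop starts, and the file is closed as soon as the with block exits.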
Upvotes: 1