Reputation: 38
I'm formatting GPS output logs and I need an efficient method to remove x number of lines above the line that contains a 0 and y number of lines below that line.
*--------------------------------------*
UTC Time: 000000.00
Latitude: 0000.0000
N/S ind.: N
Longitude: 0000.0000
E/W ind: E
Position fix ind: 0
Satellites Used: 3
MSL Altitude: 00.0
*--------------------------------------*
If a line contains "Position fix ind: 0", remove the 6 lines above it, the 3 lines below it, and the line itself.
EDIT:
The input file is a .log file
EDIT 2:
input file
1
2
3
*--------------------------------------*
UTC Time: 000000.00
Latitude: 0000.0000
N/S ind.: N
Longitude: 0000.0000
E/W ind: E
Position fix ind: 0
Satellites Used: 3
MSL Altitude: 00.0
*--------------------------------------*
3
2
1
1
2
3
*--------------------------------------*
UTC Time: 000000.00
Latitude: 0000.0000
N/S ind.: N
Longitude: 0000.0000
E/W ind: E
Position fix ind: 5
Satellites Used: 3
MSL Altitude: 00.0
*--------------------------------------*
3
2
1
Upvotes: 1
Views: 1455
Reputation: 143
I needed what @inspectorG4dget provided, and I'm grateful for it. But I needed to make the changes in 2500+ files, and to the original files themselves, so I added an extra function that handles that. list.txt contains the names of the files to change, and temp/tempfile is used as temporary output.
from shutil import copyfile

def remLines(infilepath, outfilepath, delim, above, below):
    # The with block closes both files, so the output is flushed before it is copied.
    with open(infilepath) as infile, open(outfilepath, 'w') as outfile:
        buff = []  # the last `above` lines read but not yet written
        line = infile.readline()
        while line:
            if line.strip() == delim:
                buff = []  # discard the lines above the match
                for _ in range(below):  # skip the lines below the match
                    infile.readline()
            else:
                if len(buff) == above:
                    outfile.write(buff[0])
                    buff = buff[1:]
                buff.append(line)
            line = infile.readline()
        outfile.write(''.join(buff))

def readfiles(listfilepath, tempfilepath):
    with open(listfilepath) as refile:
        for line in refile:
            realfilepath = line.strip()
            remLines(realfilepath, tempfilepath, 'This is test line 17', 2, 7)
            copyfile(tempfilepath, realfilepath)  # overwrite the original file

if __name__ == "__main__":
    readfiles('list.txt', 'temp/tempfile')
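If copying a shared temp file back over each original feels fragile, here is a rough sketch of an alternative: write to a temporary file in the same directory and swap it in with os.replace. The names apply_in_place and process are made up for illustration; process is assumed to be any function taking (inpath, outpath), such as a wrapper around remLines.

```python
import os
import tempfile

def apply_in_place(path, process):
    """Run process(inpath, outpath) and replace `path` with the result."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    os.close(fd)  # process() reopens the file by name
    try:
        process(path, tmp_path)
        os.replace(tmp_path, path)  # swap the processed copy into place
    except BaseException:
        # Clean up the temp file if processing or the swap failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

It could be called as apply_in_place(name, lambda src, dst: remLines(src, dst, delim, above, below)). Note that mkstemp creates the file readable and writable only by its owner, so the replaced file may end up with different permissions than the original.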
Upvotes: 0
Reputation: 251096
You can use a set here: iterate over the file and, as soon as you see 'Position fix ind: 0' in a line (say the index of that line is i), add the numbers from i-6 to i+3 to the set.
f = open('abc')
se = set()
for i, x in enumerate(f):
    if 'Position fix ind: 0' in x:
        se.update(range(i-6, i+4))
f.close()
Now iterate over the file again and skip those indexes that are present in that set:
f = open('abc')
f1 = open('out.txt', 'w')
for i, x in enumerate(f):
    if i not in se:
        f1.write(x)
f.close()
f1.close()
input file:
1
2
3
*--------------------------------------*
UTC Time: 000000.00
Latitude: 0000.0000
N/S ind.: N
Longitude: 0000.0000
E/W ind: E
Position fix ind: 0
Satellites Used: 3
MSL Altitude: 00.0
*--------------------------------------*
3
2
1
1
2
3
*--------------------------------------*
UTC Time: 000000.00
Latitude: 0000.0000
N/S ind.: N
Longitude: 0000.0000
E/W ind: E
Position fix ind: 5
Satellites Used: 3
MSL Altitude: 00.0
*--------------------------------------*
3
2
1
output:
1
2
3
3
2
1
1
2
3
*--------------------------------------*
UTC Time: 000000.00
Latitude: 0000.0000
N/S ind.: N
Longitude: 0000.0000
E/W ind: E
Position fix ind: 5
Satellites Used: 3
MSL Altitude: 00.0
*--------------------------------------*
3
2
1
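For reuse, the set approach from this answer could be wrapped in one function that works on a list of lines already in memory. This is only a sketch: drop_around and its parameter names are hypothetical, and it assumes the file is small enough to hold in a list.

```python
def drop_around(lines, marker, above, below):
    """Return lines without each marker line, `above` lines before it and `below` after it."""
    skip = set()
    for i, line in enumerate(lines):
        if marker in line:
            # Indexes before 0 or past the end are harmless: they never match a real line.
            skip.update(range(i - above, i + below + 1))
    return [line for i, line in enumerate(lines) if i not in skip]
```

With a file this would be something like drop_around(f.readlines(), 'Position fix ind: 0', 6, 3), followed by writelines on the output file.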
Upvotes: 0
Reputation: 114025
def remLines(infilepath, outfilepath, delim, above, below):
    with open(infilepath) as infile, open(outfilepath, 'w') as outfile:
        buff = []  # the last `above` lines read but not yet written
        line = infile.readline()
        while line:
            if line.strip() == delim:
                buff = []  # discard the lines above the match
                for _ in range(below):  # need to error check here, if you're not certain that your input file is correctly formatted
                    infile.readline()
            else:
                if len(buff) == above:
                    outfile.write(buff[0])
                    buff = buff[1:]
                buff.append(line)
            line = infile.readline()
        outfile.write(''.join(buff))

if __name__ == "__main__":
    remLines('path/to/input', 'path/to/output', "Position fix ind: 0", 6, 3)
Testing:
Input:
1
2
3
*--------------------------------------*
UTC Time: 000000.00
Latitude: 0000.0000
N/S ind.: N
Longitude: 0000.0000
E/W ind: E
Position fix ind: 0
Satellites Used: 3
MSL Altitude: 00.0
*--------------------------------------*
3
2
1
1
2
3
*--------------------------------------*
UTC Time: 000000.00
Latitude: 0000.0000
N/S ind.: N
Longitude: 0000.0000
E/W ind: E
Position fix ind: 5
Satellites Used: 3
MSL Altitude: 00.0
*--------------------------------------*
3
2
1
Output:
1
2
3
3
2
1
1
2
3
*--------------------------------------*
UTC Time: 000000.00
Latitude: 0000.0000
N/S ind.: N
Longitude: 0000.0000
E/W ind: E
Position fix ind: 5
Satellites Used: 3
MSL Altitude: 00.0
*--------------------------------------*
3
2
1
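The same buffering idea can also be written as a generator over any iterable of lines, using collections.deque so that dropping the oldest buffered line does not copy the list. This is a sketch rather than a drop-in replacement: filter_lines is a made-up name, and it matches the delimiter by exact equality after strip(), like remLines does.

```python
from collections import deque

def filter_lines(lines, delim, above, below):
    """Yield lines, dropping each delim line plus `above` lines before and `below` after it."""
    buff = deque()  # lines held back until we know they are not within `above` of a match
    it = iter(lines)
    for line in it:
        if line.strip() == delim:
            buff.clear()  # discard the lines above the match
            for _ in range(below):
                next(it, None)  # skip the lines below; stops quietly at end of input
        else:
            buff.append(line)
            if len(buff) > above:
                yield buff.popleft()
    yield from buff  # whatever is still buffered at the end is safe to emit
```

Because it reads lazily, it can be fed an open file and its result passed straight to outfile.writelines.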
Upvotes: 3
Reputation: 23374
If the files aren't too large:
import re
p = re.compile(r'(?:.*\n){6}\s*Position fix ind: 0\n(?:.*\n){3}')
with open('test.txt') as f:
    output = p.sub('', f.read())
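To make the counts adjustable, the pattern could be built from parameters, roughly as below. This is a sketch under assumptions: strip_blocks and its arguments are hypothetical names, the marker is escaped with re.escape, and only spaces or tabs are allowed around it, which is slightly stricter than the \s* above.

```python
import re

def strip_blocks(text, marker='Position fix ind: 0', above=6, below=3):
    """Remove each marker line together with `above` lines before and `below` lines after it."""
    pattern = re.compile(
        r'(?:.*\n){%d}[ \t]*%s[ \t]*\n(?:.*\n){%d}'
        % (above, re.escape(marker), below)
    )
    return pattern.sub('', text)
```

The result still has to be written out, for example with open('out.txt', 'w') and a single write of the returned string.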
Upvotes: 0