Reputation: 4163
I have a very large JavaScript file that I was trying to analyze. Much of the code had its newlines removed, which made it hard to read, so I used a find-and-replace to change every instance of ;
to ;\u000A
(\u000A is the Unicode code point for a newline). This solved my problem and the program became more readable. However, it introduced another problem: every for
loop got split across several lines.
For instance:
for(i=0; i<someValue; i++)
got changed to
for(i=0;
i<someValue;
i++)
I want to write a Python program to undo this mistake. My thinking was along these lines:
for line in open('index.html', 'r+'):
    if line.startswith('for(') and line.endswith(';'):
        line.strip('\n')
However, I don't know how to strip the newline character from the lines that follow, since the for
loop only reads one line at a time. Could anyone suggest what I need to do?
Upvotes: 0
Views: 1254
Reputation: 1121396
A Python file object is an iterator, so you can ask it for the next line while looping:
with open(inputfilename) as ifh:
    for line in ifh:
        if line.startswith('for(') and line.endswith(';\n'):
            line = line.rstrip('\n') + next(ifh).rstrip('\n') + next(ifh)
This uses the next()
function to retrieve the next two items from the ifh
file object and add them to the current line. The outer loop will continue with the line after that.
To illustrate, look at the output of this iterator loop:
>>> lst = [1, 2, 3, 4]
>>> lst_iter = iter(lst)
>>> for i in lst_iter:
...     print(i)
...     if i == 2:
...         print('skipping ahead to', next(lst_iter))
...
1
2
skipping ahead to 3
4
Here next()
advanced the lst_iter
iterator to the next item, and the outer for
loop then continued with the next value after that.
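As a minimal sketch of the same idea applied to the split loop headers, the function below works on any iterable of lines rather than a file, so it can be tried without touching disk. The name join_split_for_loops and the sample data are hypothetical, and passing '' as the default to next() is an added safeguard (not part of the snippet above) so a truncated input does not raise StopIteration.

```python
def join_split_for_loops(lines):
    """Rejoin `for(` headers that were split into three lines at each `;`."""
    it = iter(lines)
    out = []
    for line in it:
        if line.startswith('for(') and line.endswith(';\n'):
            # Pull the next two lines from the same iterator; the outer
            # loop then resumes with whatever follows them.
            line = line.rstrip('\n') + next(it, '').rstrip('\n') + next(it, '')
        out.append(line)
    return out


# Hypothetical sample mirroring the broken output from the question.
sample = ['var x = 1;\n', 'for(i=0;\n', 'i<someValue;\n', 'i++)\n', 'x++;\n']
result = join_split_for_loops(sample)
print(''.join(result), end='')
```

Only lines that start with for( and end with ; are treated as split headers, so ordinary statements pass through unchanged.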
Your next problem is rewriting the file in place. You cannot read from and write to the same file at the same time and expect to replace just the right parts; buffering and differing line lengths get in the way.
Use the fileinput
module to handle replacing the contents of a file:
import sys
import fileinput
# inplace=True redirects sys.stdout into the file being rewritten
ifh = fileinput.input(inputfilename, inplace=True)
for line in ifh:
    if line.startswith('for(') and line.endswith(';\n'):
        line = line.rstrip('\n') + next(ifh).rstrip('\n') + next(ifh)
    sys.stdout.write(line)
Or use my in-place file rewriting context manager:
from inplace import inplace
with inplace(inputfilename) as (ifh, ofh):
    for line in ifh:
        if line.startswith('for(') and line.endswith(';\n'):
            line = line.rstrip('\n') + next(ifh).rstrip('\n') + next(ifh)
        ofh.write(line)
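If you would rather stick to plain standard-library file handling, a different technique is to write the result to a temporary file and then swap it over the original. This is only a sketch of that approach, assuming Python 3.3+ for os.replace(); the function name rewrite_in_place is made up for illustration, and the '' defaults to next() are an added safeguard against a truncated file.

```python
import os
import tempfile


def rewrite_in_place(path):
    """Join split `for(` headers in `path`, then replace the original file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that os.replace()
    # does not have to move data across filesystems.
    with open(path) as ifh, tempfile.NamedTemporaryFile(
            'w', dir=directory, delete=False) as ofh:
        for line in ifh:
            if line.startswith('for(') and line.endswith(';\n'):
                line = line.rstrip('\n') + next(ifh, '').rstrip('\n') + next(ifh, '')
            ofh.write(line)
    # Both files are closed here; swap the rewritten copy into place.
    os.replace(ofh.name, path)
```

Usage would be a single call such as rewrite_in_place('index.html'). The original is never open for writing, so reading and writing cannot interfere with each other.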
Upvotes: 1
Reputation: 2198
You can use a counter, like this:
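cnt = 2  # 2 means "not inside a split for header"
for line in open('index.html'):
    if line.startswith('for(') and line.endswith(';\n'):
        cnt = 0
    if cnt < 2:
        # strip the newline from the header line and the one after it
        line = line.strip('\n')
        cnt += 1
    # line is now ready to be written out or collected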
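The loop above only modifies line and does not store it anywhere. As a rough sketch of how the counter idea could produce output, the version below collects the lines into a string; the function name join_with_counter and the sample input are hypothetical.

```python
def join_with_counter(lines):
    """Rejoin split `for(` headers using a counter instead of next()."""
    out = []
    cnt = 2  # 2 means "not inside a split for header"
    for line in lines:
        if line.startswith('for(') and line.endswith(';\n'):
            cnt = 0  # the next two lines (this one included) lose their newline
        if cnt < 2:
            line = line.rstrip('\n')
            cnt += 1
        out.append(line)
    return ''.join(out)


# Hypothetical sample mirroring the broken output from the question.
text = join_with_counter(['for(i=0;\n', 'i<n;\n', 'i++)\n', 'x++;\n'])
print(text, end='')
```

The returned string can then be written to a new file once the input file has been closed.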
Upvotes: 0