How do I check for next to next line while reading a file in python and strip the newline character at its end?

Question

I have a very large JavaScript file that I was trying to analyze. Much of the code had its newlines removed, which made it hard to read, so I used a replace function to find every instance of ; and replace it with ;\u000A (\u000A is the Unicode escape for newline). This solved my problem and the program became more readable. However, I now have another problem: every for loop got changed.

For instance:

for(i=0; i



got changed to 

for(i=0;
i


I want to write a program in Python to fix this mistake. My thinking was along these lines:

for line in open('index.html', 'r+'):
    if line.startswith('for(') and line.endswith(';'):
        line.strip('\n')


However, I don't know what code to use to strip the next line's newline character, since the for loop only reads one line at a time. Could anyone suggest what I would need to do?

Martijn Pieters · Accepted Answer

A Python file object is an iterator, so you can ask it for the next line while looping:

with open(inputfilename) as ifh:
    for line in ifh:
        if line.startswith('for(') and line.endswith(';\n'):
            line = line.rstrip('\n') + next(ifh).rstrip('\n') + next(ifh)

This uses the next() function to retrieve the next two items from the ifh file object and add them to the current line. The outer loop will continue with the line after that.

To illustrate, look at the output of this iterator loop:

>>> lst = [1, 2, 3, 4]
>>> lst_iter = iter(lst)
>>> for i in lst_iter:
...     print(i)
...     if i == 2:
...         print('skipping ahead to', next(lst_iter))
...
1
2
skipping ahead to 3
4

Here next() advanced the lst_iter iterator to the next item, and the outer for loop then continued with the value after that.
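
The same look-ahead can be wrapped in a small generator. This is only a sketch, assuming (as the code above does) that a split loop header always spans exactly three lines; the name join_for_headers is hypothetical. Passing a default to next() avoids a StopIteration if the input ends in the middle of a header.

```python
def join_for_headers(lines):
    """Yield lines, gluing a split `for(...;` header back onto its next two lines."""
    it = iter(lines)
    for line in it:
        if line.startswith('for(') and line.endswith(';\n'):
            # Pull the next two lines from the same iterator; the default ''
            # avoids StopIteration if the input ends in the middle of a header.
            second = next(it, '')
            third = next(it, '')
            line = line.rstrip('\n') + second.rstrip('\n') + third
        yield line


# Hypothetical sample input: a for loop header split across three lines.
sample = ['a=1;\n', 'for(i=0;\n', 'i<n;\n', 'i++){f(i);\n', '}\n']
print(''.join(join_for_headers(sample)), end='')
```

Because the generator only sees an iterable of lines, it works equally well on a list, as here, or on an open file object.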

Your next problem is rewriting the file in place. You cannot read from and write to the same file at the same time and expect to replace just the right parts; buffering and differing line lengths get in the way.

Use the fileinput module to handle replacing the contents of a file:

import sys
import fileinput

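# inplace=True redirects sys.stdout into the file being rewritten
lines = fileinput.input(inputfilename, inplace=True)
for line in lines:
    if line.startswith('for(') and line.endswith(';\n'):
        line = line.rstrip('\n') + next(lines).rstrip('\n') + next(lines)
    sys.stdout.write(line)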

or use my in-place file rewriting context manager.

from inplace import inplace

with inplace(inputfilename) as (ifh, ofh):
    for line in ifh:
        if line.startswith('for(') and line.endswith(';\n'):
            line = line.rstrip('\n') + next(ifh).rstrip('\n') + next(ifh)
        ofh.write(line)
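
If you would rather not depend on either helper, a standard-library-only variant is to write the result to a temporary file and then swap it into place. This is a rough sketch under the same three-line assumption, not the context manager mentioned above; the function name rewrite_joined is hypothetical.

```python
import os
import tempfile


def rewrite_joined(path):
    """Rewrite `path` with split `for(...;` headers joined back onto one line."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then swap it in,
    # so the original is never read and written at the same time.
    with open(path) as ifh, tempfile.NamedTemporaryFile(
            'w', dir=directory, delete=False) as ofh:
        for line in ifh:
            if line.startswith('for(') and line.endswith(';\n'):
                # Defaults of '' guard against the file ending mid-header.
                line = (line.rstrip('\n')
                        + next(ifh, '').rstrip('\n')
                        + next(ifh, ''))
            ofh.write(line)
    # Both files are closed here; replace the original with the new content.
    os.replace(ofh.name, path)
```

Creating the temporary file in the same directory keeps the final os.replace() on a single filesystem.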
