
Reputation: 13

Python: Searching for text between lines with keywords

I'm trying to process text files in Python. The file structure looks something like this:

info to process
info to process
START
...
END
info to process
START
...
END

I need to process the file line by line (I'm using a simple "for line in file" for that), but I also need to remove anything that's between START and END.

The most similar problem I found here is this one, but it doesn't work for me because:

It searches the file as a whole, whereas I need to process it line by line.
It's not Python code, and as a newbie I couldn't translate it.

I thought about adding a variable, setting it to true when the loop meets START and to false when it meets END, and saving output based on this variable, but this seems a very not-python-like way to implement this.

I expect the resulting file to look like this:

Processed info
Processed info

Processed info

Upvotes: 1

Answers (2)

mttpgn

Reputation: 372

Personally, I don't understand what you mean by characterizing your proposed solution as "very not-python-like."

I implemented your suggestion as follows and got the outcome you expected:

with open('test.txt', 'r') as f_orig, open('test2.txt', 'w') as f_new:
    skipping = False  # must be set before the loop, or the first check raises NameError
    for line in f_orig:
        if line[:5] == 'START':
            skipping = True
        if not skipping:
            f_new.write(line)
        if line[:3] == 'END':
            skipping = False
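
If you want to keep your own "for line in file" loop for the processing step, the same flag can be wrapped in a generator. This is only a sketch: the function name lines_outside_blocks and its parameters are made up for illustration, and it assumes the markers sit alone on their line (it compares the stripped line exactly, instead of checking a prefix as above).

```python
def lines_outside_blocks(lines, start="START", end="END"):
    """Yield only the lines that are not inside a start..end block.

    Hypothetical helper: assumes each marker is alone on its line.
    """
    skipping = False
    for line in lines:
        stripped = line.strip()
        if stripped == start:
            skipping = True       # drop the START line and everything after it
        elif stripped == end and skipping:
            skipping = False      # drop the END line, then resume yielding
        elif not skipping:
            yield line


sample = ["info to process\n", "START\n", "...\n", "END\n", "info to process\n"]
kept = list(lines_outside_blocks(sample))
```

Because an open file is itself an iterable of lines, you could write "for line in lines_outside_blocks(f_orig):" and do your processing inside that loop without reading the whole file into memory.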

Upvotes: 1

logi-kal

Reputation: 7880

Try this:

import re

oldtext = '''info to process
info to process
START
...
END
info to process
START
...
END'''

newtext = re.sub(r"(?ms)^START$.*?^END$", "", oldtext)

See here for a demo.
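
Note that the pattern above leaves an empty line where each block used to be, because the line break after END is not part of the match. If you would rather drop that line break too, a possible variant (my own tweak, assuming "\n" line endings and that the whole text fits in memory) is to add an optional \n at the end of the pattern:

```python
import re

oldtext = """info to process
info to process
START
...
END
info to process
START
...
END"""

# The trailing \n? also consumes the line break after END,
# so no empty line is left behind where a block was removed.
newtext = re.sub(r"(?ms)^START$.*?^END$\n?", "", oldtext)
print(newtext, end="")
```

This still works on the text as a whole, so you would read the file with f.read() first and then loop over newtext.splitlines() for the per-line processing.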

Upvotes: 1
