alvas

Remove comment blocks bounded by `#|...|#` from a text file in Python

How can I remove comment blocks from a text file? Each comment is enclosed between #| and |#\n.

Infile:

#|\n this is some sort of foo bar\n that I don't care about|#\nthen there is a foo bar sentence that I want but i don't want that foo bar in within the hex pipe pipe hex comment block.#| and even so, i don't want this section to appear|#\n with some crazy sentence...

Desired output:

then there is a foo bar sentence that I want but i don't want that foo bar in within the hex pipe pipe hex comment block. with some crazy sentence...

Is there a better way to remove the comment blocks other than the following?

txt = '''#|\n this is some sort of foo bar\n that I don't care about|#\nthen there is a foo bar sentence that I want but i don't want that foo bar in within the hex pipe pipe hex comment block.#| and even so, i don't want this section to appear|#\n with some crazy sentence...'''

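cleantxt = ''  # accumulates the text found outside comment blocks
pointer = 0
while pointer < len(txt):
    try:
        start = txt.index('#|', pointer)  # start of the next comment block
        end = txt.index('|#\n', start)    # its closing delimiter
        cleantxt += txt[pointer:start]    # keep the text before the block
        pointer = end + 3                 # skip past '|#\n'
    except ValueError:
        # no further complete block: keep the remainder and stop
        cleantxt += txt[pointer:]
        break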

Answers (1)

augustomen

You can use a regular expression with a non-greedy match and re.DOTALL, so that . also matches line breaks:

>>> import re
>>> txt = '''#|\n this is some sort of foo bar\n that I don't care about|#\nthen there is a foo bar sentence that I want but i don't want that foo bar in within the hex pipe pipe hex comment block.#| and even so, i don't want this section to appear|#\n with some crazy sentence...'''
>>> txt2 = re.sub(r'#\|.*?\|#', '', txt, flags=re.DOTALL)  # remove multiline comment
>>> txt2
"\nthen there is a foo bar sentence that I want but i don't want that foo bar in within the hex pipe pipe hex comment block.\n with some crazy sentence..."

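You can also call strip() on the result to remove the leading and trailing line breaks. Note that strip() does not touch the line break left in the middle of the text, after the second comment block.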
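To reproduce the desired output from the question exactly, one option is to let the pattern consume the line break that follows each closing |# as well. The following is a minimal sketch under that assumption; the function name strip_block_comments is hypothetical and not part of the original answer.

```python
import re

# Non-greedy match from '#|' to the nearest '|#', plus one optional
# trailing newline. re.DOTALL lets '.' match across line breaks.
COMMENT_BLOCK = re.compile(r'#\|.*?\|#\n?', flags=re.DOTALL)


def strip_block_comments(text):
    """Return text with every #|...|# block (and the newline after it) removed."""
    return COMMENT_BLOCK.sub('', text)


if __name__ == '__main__':
    txt = '''#|\n this is some sort of foo bar\n that I don't care about|#\nthen there is a foo bar sentence that I want but i don't want that foo bar in within the hex pipe pipe hex comment block.#| and even so, i don't want this section to appear|#\n with some crazy sentence...'''
    print(strip_block_comments(txt))
```

Because the newline is optional in the pattern, a block that ends without a trailing line break is still removed. An unterminated #| with no matching |# is left in the text unchanged, which differs slightly from nothing else in the loop-based version only in how it is expressed, not in the result.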
