BigBlue

Reputation: 39

Find and remove specific string from a line

I am hoping to receive some feedback on code I have written in Python 3. I am attempting to write a program that reads an input file containing page numbers and writes the text back out without them. The page numbers are formatted as "[13]" (this means you are on page 13). My code right now is:

pattern='\[\d\]'

for line in f:
if pattern in line:
    re.sub('\[\d\]',' ')
    re.compile(line)
    output.write(line.replace('\[\d\]', ''))

I have also tried:

for line in f:
    if pattern in line:
    re.replace('\[\d\]','')
    re.compile(line)
    output_file.write(line)

When I run these programs, a blank file is created, rather than a file containing the original text minus the page numbers. Thank you in advance for any advice!

Upvotes: 0

Views: 33

Answers (2)

Soviut

Reputation: 91545

Your if statement won't work because it's not doing a regex match; it's looking for the literal string \[\d\] in line.

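for line in f:
    # determine if the pattern is found anywhere in the line
    # (\d+ so that multi-digit page numbers such as [13] match)
    if re.search(r'\[\d+\]', line):
        line = re.sub(r'\[\d+\]', '', line)
    # write every line, whether or not it was modified
    output_file.write(line)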
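
To make the difference concrete, here is a small sketch comparing the literal substring test with re.match() and re.search(). The sample line is made up, and the pattern uses \d+ on the assumption that page numbers can have more than one digit. Note that re.match() only succeeds at the very start of the string, so re.search() is usually the better fit for a marker that can appear anywhere in a line.

```python
import re

PATTERN = r'\[\d+\]'  # assumption: \d+ so multi-digit pages such as [13] match

line = 'end of the page [13] and the text continues'  # made-up sample line

# Substring test: looks for the literal characters \[\d+\], which are not in the text.
literal_found = PATTERN in line

# re.match() only succeeds if the pattern matches at the very start of the string.
match_at_start = re.match(PATTERN, line) is not None

# re.search() scans the whole string for the first match.
found_anywhere = re.search(PATTERN, line) is not None

print(literal_found, match_at_start, found_anywhere)
```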

Additionally, you're using re.compile() incorrectly. Its purpose is to pre-compile your pattern into a reusable pattern object, not to process a line of text. This can improve performance if you use the pattern a lot, because the expression is compiled once rather than being looked up again on every loop iteration.

pattern = re.compile(r'\[\d+\]')

# search() finds the pattern anywhere in the line; match() only checks the start
if pattern.search(line):
    # ...
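
As a rough illustration of reusing a compiled pattern, the sketch below calls the pattern object's own sub() method on a few made-up lines. The name page_marker and the sample data are placeholders, not part of the original code.

```python
import re

# Compiled once, then reused for every line.
page_marker = re.compile(r'\[\d+\]')

lines = ['first [1] line\n', 'no marker here\n', 'last [13] line\n']  # made-up data

# sub() on a compiled pattern takes (replacement, string);
# a line with no marker comes back unchanged.
cleaned = [page_marker.sub('', line) for line in lines]

print(cleaned)
```

Removing only the marker leaves the spaces on both sides of it in place; if that matters, a pattern such as r'\s*\[\d+\]' would also consume the leading whitespace.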

Lastly, you're getting a blank file because the if condition is never true, so output_file.write() is never called. Note that file objects have no writeline() method: use write() for each line, or writelines() to write a list of lines in one go.

Upvotes: 1

Christoph Bluoss

Reputation: 359

You don't write unmodified lines to your output.

Try something like this:

if pattern in line:
    #remove page number stuff
output_file.write(line) # note that it's not part of the if block above

That's why your output file is empty.
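
Putting the pieces together, a minimal end-to-end sketch might look like the following. It assumes page markers are one or more digits in square brackets, and the function name strip_page_numbers and the file names are hypothetical. Because re.sub() returns the line unchanged when nothing matches, no if is needed and every line is written.

```python
import os
import re
import tempfile

PAGE_MARKER = re.compile(r'\[\d+\]')  # assumption: markers look like [7] or [13]


def strip_page_numbers(src_path, dst_path):
    """Copy src_path to dst_path with every page marker removed."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:
            # Lines without a marker pass through unchanged,
            # so every line of the input reaches the output.
            dst.write(PAGE_MARKER.sub('', line))


# Usage with throwaway files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'input.txt')
    dst_path = os.path.join(tmp, 'output.txt')
    with open(src_path, 'w') as f:
        f.write('Chapter one[12]\nplain line\n[13]Chapter two\n')
    strip_page_numbers(src_path, dst_path)
    with open(dst_path) as f:
        result = f.read()

print(result)
```

Opening both files in a with block also guarantees the output file is flushed and closed, which rules out an empty file caused by unwritten buffers.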

Upvotes: 0
