user1583260

Reputation: 21

Python - Removing certain lines from input file based on content

I'm a beginner with Python, and I'm trying to learn through Google and some books. I'm working on a specific project and doing OK with it so far.

The first part of my program takes an input text file and scans each line for certain data. It then writes the line out to a new file only if the line doesn't match any of the search criteria.

What I've done is ugly as hell, and it's also very slow. When I run it on a Raspberry Pi, this part alone takes 4 seconds (the input file is just over 1700 lines of text).

Here's my effort:

    with open('mirror2.txt', mode='r') as fo:
        lines = fo.readlines()
        with open('temp/data.txt', mode='w') as of:
            for line in lines:
                date = 0
                page = 0
                dash = 0
                empty = 0
                if "Date" in line: date += 1
                if "Page" in line: page += 1
                if "----" in line: dash += 1
                if line == "\n":   empty += 1
                sum = date + page + dash + empty
                if sum == 0:
                    of.write(line)
                else:()

I'm embarrassed to show that in public, but I'd love to see a 'pythonic' way to do it more elegantly (and quicker!)

Can anyone help?

Upvotes: 2

Views: 839

Answers (2)

koblas

Reputation: 27048

To answer your question, here's a pythonic way of doing this:

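    import re

    # Matches any line containing "Date", "Page" or "----", or an empty line.
    expr = re.compile(r'(?:Date|Page|----|^$)')
    with open('mirror2.txt', mode='r') as infile:
        with open('data.txt', mode='w') as outfile:
            for line in infile:
                # strip() removes the trailing newline so that ^$ matches blank
                # lines; note that whitespace-only lines are dropped as well.
                if not expr.search(line.strip()):
                    outfile.write(line)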
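For comparison, the same filter can be written without a regex. This is only a sketch: the names `MARKERS` and `keep_line` are made up for illustration, and the sample lines are invented. It keeps the question's exact blank-line test (`line == "\n"`), so unlike the regex version it does not drop whitespace-only lines.

```python
# Substrings that mark a line for removal (taken from the checks in the question).
MARKERS = ("Date", "Page", "----")


def keep_line(line):
    """Return True if the line should be copied to the output (hypothetical helper)."""
    if line == "\n":  # drop empty lines, exactly as the question's code does
        return False
    # any() stops at the first marker found in the line
    return not any(marker in line for marker in MARKERS)


# Invented sample input, standing in for lines read from the file.
sample = ["Date: today\n", "\n", "real data\n", "Page 3\n", "-----\n", "more data\n"]
kept = [line for line in sample if keep_line(line)]
print(kept)  # ['real data\n', 'more data\n']
```

Whether this is faster than the compiled regex on a Raspberry Pi is something you would have to measure; with only three substrings either approach should be cheap.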

Upvotes: 2

C2H5OH

Reputation: 5602

The Pythonic way of reading a file line by line, adapted to your case, would be:

    with open('mirror2.txt', mode='r') as fo:
        for line in fo:
            pass  # the rest of your per-line processing goes here

If this speeds up your program noticeably, it would mean that the Python interpreter is not doing a very good job at managing memory on ARM processors.

The rest has already been mentioned in comments.
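Putting the line-by-line loop together with the checks from the question, a complete version might look like the sketch below. The function name `filter_file` and its `markers` parameter are hypothetical, and the demonstration uses a throwaway temporary file rather than `mirror2.txt`.

```python
import os
import tempfile


def filter_file(src_path, dst_path, markers=("Date", "Page", "----")):
    """Copy src to dst, skipping empty lines and lines containing a marker."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        # The generator is consumed lazily, so the whole input is never
        # loaded into a list the way readlines() would do.
        dst.writelines(
            line for line in src
            if line != "\n" and not any(m in line for m in markers)
        )


# Small demonstration on a temporary file with invented content.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "in.txt")
    dst_path = os.path.join(tmp, "out.txt")
    with open(src_path, "w") as f:
        f.write("Date\nkeep me\n\n----\nPage 1\nkeep me too\n")
    filter_file(src_path, dst_path)
    with open(dst_path) as f:
        result = f.read()

print(result)
```

In your case the call would be something like `filter_file('mirror2.txt', 'temp/data.txt')`, assuming the `temp` directory already exists.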

Upvotes: 0
