Reputation: 21

Removing blank lines and comments in Python (w/o Regex)

I am a beginner (3rd week of coding) and have the following problem:

I want to remove all comments and blank lines from a list created w/ .readlines(). However, my approach seems to be wrong and I don't know how to proceed.

The contents of the .txt file are:

subject1: 3.5

# comment hello


subject2: 4.25
subject3:5.20


subject4:           4.75

And my code:

import os

def get_average_grade(path):
    if not os.path.exists(path):
        return None
    with open(path, "r") as file:
        gradelist = file.readlines()
        print(gradelist)
        amount = 0
        for item in gradelist:
            if item[0] == "#" or item[0] == "\n":
                gradelist.remove(item)

    print(gradelist)

The output should contain only the subjects and their corresponding grades (grades in Switzerland are floats from 1 to 6):

My actual output

['subject1: 3.5\n', '\n', '# comment hello\n', '\n', '\n', 'subject2: 4.25\n', 'subject3:5.20\n', '\n', '\n', 'subject4:           4.75\n', '\n']

['subject1: 3.5\n', '# comment hello\n', 'subject2: 4.25\n', 'subject3:5.20\n', '\n', 'subject4:           4.75\n', '\n']

My expected output

['subject1: 3.5\n', '\n', '# comment hello\n', '\n', '\n', 'subject2: 4.25\n', 'subject3:5.20\n', '\n', '\n', 'subject4:           4.75\n', '\n']

['subject1: 3.5', 'subject2: 4.25', 'subject3:5.20', 'subject4:          4.75']

As you can see, the comment and some blank lines remain in the list in my actual output.

Any help is gladly appreciated! Thank you.

Upvotes: 1

Answers (4)

KittoMi

Reputation: 459

I think this will help. Instead of removing items from the list while looping over it, it builds a new list from the lines you want to keep.

import os

def get_average_grade(path):
    if not os.path.exists(path):
        return None
    with open(path, "r") as file:
        gradelist = []
        for item in file:
            # Skip comment lines and blank lines.
            if item.startswith("#") or item.startswith("\n"):
                continue
            # Drop the trailing newline and surrounding whitespace.
            gradelist.append(item.strip())

    print(gradelist)

**Output**

['subject1: 3.5', 'subject2: 4.25', 'subject3:5.20', 'subject4:           4.75']

Upvotes: 0

baduker

Reputation: 20042

You might want to strip the lines and skip those that are blank or start with #.

Try this:

with open("sample.txt") as f:
    sample = f.readlines()

print([l.strip() for l in sample if not l.startswith("#") and l.strip() != ""])

Then, if you feel like it, you can create a dictionary with all subjects and marks:

with open("sample.txt") as f:
    sample = f.readlines()

cleaned_up = [l.strip() for l in sample if not l.startswith("#") and l.strip() != ""]

data = {}
for item in cleaned_up:
    subject, mark = item.split(":")
    subject = subject.strip()
    data.setdefault(subject, []).append(mark.strip())

print(data)

Output:

{'subject1': ['3.5'], 'subject2': ['4.25'], 'subject3': ['5.20'], 'subject4': ['4.75']}
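Since the function in the question is called get_average_grade, the marks could be taken one step further and averaged. This is only a minimal sketch, assuming every non-blank, non-comment line holds exactly one `subject: mark` pair and that every mark parses as a float. The helper name `average_grade` is hypothetical and not part of the code above.

```python
def average_grade(lines):
    """Return the mean of the marks in `lines`, or None if there are none."""
    marks = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        # partition() splits on the first ":" only; float() tolerates
        # the extra spaces around the mark.
        _subject, _sep, mark = stripped.partition(":")
        marks.append(float(mark))
    if not marks:
        return None
    return sum(marks) / len(marks)


# The same lines that .readlines() returns in the question.
sample = [
    "subject1: 3.5\n", "\n", "# comment hello\n", "\n", "\n",
    "subject2: 4.25\n", "subject3:5.20\n", "\n", "\n",
    "subject4:           4.75\n", "\n",
]
print(average_grade(sample))
```

A line without a colon would make `float("")` raise a ValueError, so malformed input needs its own handling if it can occur.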

Upvotes: 1

Daan Klijn

Reputation: 1684

Although the other answers try to solve this using list comprehensions, I believe it's probably a bit clearer to do it without them.

The code is similar to the code in the question, except that instead of removing the invalid items from the gradelist, it appends valid items to an empty list.

import os

def get_average_grade(path):
    if not os.path.exists(path):
        return None
    with open(path, "r") as file:
        text = file.read()
        splitted = text.split('\n')
        gradelist = []

        for item in splitted:
            if item != "" and item[0] != "#":
                gradelist.append(item)

    print(gradelist)
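For context on why the original loop leaves items behind: calling remove() on a list while a for loop is walking over it shifts the remaining items one position to the left, so the loop steps over the item that slides into the freed slot. The sketch below reproduces that with the lines from the question; the variable names `buggy` and `safe` are only illustrative.

```python
gradelist = [
    "subject1: 3.5\n", "\n", "# comment hello\n", "\n", "\n",
    "subject2: 4.25\n", "subject3:5.20\n", "\n", "\n",
    "subject4:           4.75\n", "\n",
]

# Buggy: the list shrinks while the loop's internal index keeps advancing,
# so the item right after each removed one is never inspected.
buggy = list(gradelist)
for item in buggy:
    if item[0] == "#" or item[0] == "\n":
        buggy.remove(item)

# Safe: leave the original untouched and collect the wanted lines instead.
safe = []
for item in gradelist:
    if item[0] != "#" and item[0] != "\n":
        safe.append(item)

print(buggy)  # the comment and two blank lines are still there
print(safe)   # only the four subject lines remain
```

Looping over a copy (`for item in gradelist[:]`) while removing from the original would also avoid the skipping, but building a new list is usually easier to read.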

Upvotes: 1

Suraj

Reputation: 2477

with open('file.txt', 'r') as f:
    lines = f.read()

subjects = [row for row in lines.split('\n') if '#' not in row and row]
subjects = {row.split(':')[0].strip():row.split(':')[1].strip() for row in subjects}

Output :

>>> subjects
{'subject1': '3.5', 'subject2': '4.25', 'subject3': '5.20', 'subject4': '4.75'}

The code is pretty simple: split the text on \n, drop the empty rows and the rows containing #, then split each remaining row on : to create a dictionary.
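One thing to be aware of: testing `'#' not in row` drops every row that contains a # anywhere, so a grade line with a trailing comment would be discarded too. The file in the question has no such lines, so the following is only a sketch under the assumption that everything after a # should be ignored. The function name `parse_grades` and the `# retake` comment in the sample are made up for illustration.

```python
def parse_grades(text):
    """Map subject -> mark (as a string), ignoring blank lines and comments."""
    grades = {}
    for row in text.split("\n"):
        # Keep only the part before the first "#", then trim whitespace.
        content = row.split("#", 1)[0].strip()
        if not content:
            continue  # blank line or comment-only line
        # partition() splits on the first ":" only.
        subject, _sep, mark = content.partition(":")
        grades[subject.strip()] = mark.strip()
    return grades


# Hypothetical input with an inline comment on the subject2 line.
text = "subject1: 3.5\n\n# comment hello\n\nsubject2: 4.25  # retake\nsubject3:5.20\n"
print(parse_grades(text))
# {'subject1': '3.5', 'subject2': '4.25', 'subject3': '5.20'}
```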

Upvotes: 0
