Reputation: 21
I am a beginner (3rd week of coding) and have the following problem:
I want to remove all comments and blank lines from a list created with .readlines(). However, my approach seems to be wrong and I don't know how to proceed.
The contents of the .txt file are:
subject1: 3.5
# comment hello
subject2: 4.25
subject3:5.20
subject4: 4.75
And my code:
import os
def get_average_grade(path):
    if not os.path.exists(path):
        return None
    with open(path, "r") as file:
        gradelist = file.readlines()
        print(gradelist)
        amount = 0
        for item in gradelist:
            if item[0] == "#" or item[0] == "\n":
                gradelist.remove(item)
        print(gradelist)
The output should contain only the subjects and their corresponding grades (grades in Switzerland are floats from 1 to 6).
My actual output:
['subject1: 3.5\n', '\n', '# comment hello\n', '\n', '\n', 'subject2: 4.25\n', 'subject3:5.20\n', '\n', '\n', 'subject4: 4.75\n', '\n']
['subject1: 3.5\n', '# comment hello\n', 'subject2: 4.25\n', 'subject3:5.20\n', '\n', 'subject4: 4.75\n', '\n']
My expected output
['subject1: 3.5\n', '\n', '# comment hello\n', '\n', '\n', 'subject2: 4.25\n', 'subject3:5.20\n', '\n', '\n', 'subject4: 4.75\n', '\n']
['subject1: 3.5', 'subject2: 4.25', 'subject3:5.20', 'subject4: 4.75']
As you can see, the comment and some of the blank lines remain in the list in my actual output.
Any help is greatly appreciated! Thank you.
Upvotes: 1
Views: 595
Reputation: 459
I think this will help you. It produces the output you want.
import os
def get_average_grade(path):
    if not os.path.exists(path):
        return None
    with open(path, "r") as file:
        gradelist = []
        amount = 0
        for item in file:
            # Skip comment lines and blank lines
            if item.startswith("#") or item.startswith("\n"):
                continue
            else:
                # Remove the trailing newline before storing the line
                item = item.strip()
                gradelist.append(item)
    print(gradelist)
Output:
['subject1: 3.5', 'subject2: 4.25', 'subject3:5.20', 'subject4: 4.75']
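One caveat with the check above: item.startswith("\n") only matches lines that are completely empty, so a line holding only spaces, or a comment that is indented, would still end up in the list. A minimal sketch of a stricter filter follows, assuming such lines should be skipped too. The helper name clean_lines and the sample data are made up for illustration and are not part of the original answer.

```python
def clean_lines(lines):
    """Return stripped lines, skipping blank lines and '#' comments."""
    cleaned = []
    for line in lines:
        # Strip first, so whitespace-only lines and indented comments are caught
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        cleaned.append(stripped)
    return cleaned


# Hypothetical input with a whitespace-only line and an indented comment
raw = ["subject1: 3.5\n", "   \n", "  # indented comment\n", "subject2: 4.25\n"]
print(clean_lines(raw))  # ['subject1: 3.5', 'subject2: 4.25']
```

Because the function takes any iterable of strings, it can be passed the open file object directly instead of the result of .readlines().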
Upvotes: 0
Reputation: 20042
You might want to strip the lines and skip those that start with #.
Try this:
with open("sample.txt") as f:
    sample = f.readlines()
print([line.strip() for line in sample if not line.startswith("#") and line.strip() != ""])
Then, if you feel like it, you can create a dictionary with all subjects and marks:
with open("sample.txt") as f:
    sample = f.readlines()
cleaned_up = [line.strip() for line in sample if not line.startswith("#") and line.strip() != ""]
data = {}
for item in cleaned_up:
    subject, mark = item.split(":")
    subject = subject.strip()
    data.setdefault(subject, []).append(mark.strip())
print(data)
Output:
{'subject1': ['3.5'], 'subject2': ['4.25'], 'subject3': ['5.20'], 'subject4': ['4.75']}
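Since the function in the question is called get_average_grade, the marks probably need to become numbers at some point. Here is a rough sketch of that step, assuming every remaining line has the form "name: number". The function name average_grade and the choice to return None for an empty file are assumptions, not something stated in the question.

```python
def average_grade(lines):
    """Return the mean of the grades found in `lines`, or None if there are none."""
    grades = []
    for line in lines:
        line = line.strip()
        # Ignore blank lines and comment lines
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only; float() tolerates surrounding spaces
        _, mark = line.split(":", 1)
        grades.append(float(mark))
    if not grades:
        return None
    return sum(grades) / len(grades)


# Lines as .readlines() would return them for the file in the question
sample = ["subject1: 3.5\n", "\n", "# comment hello\n", "\n",
          "subject2: 4.25\n", "subject3:5.20\n", "\n", "subject4: 4.75\n"]
print(average_grade(sample))
```

A line whose mark is not a valid number would raise ValueError here. Whether to skip such lines or report them is a design decision left open in this sketch.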
Upvotes: 1
Reputation: 1684
Although the other answers solve this using list comprehensions, I believe it's probably a bit clearer to do it without them.
The code is similar to ijn's code, except that instead of removing the invalid items from gradelist, it appends the valid items to an empty list.
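import os
def get_average_grade(path):
    if not os.path.exists(path):
        return None
    with open(path, "r") as file:
        text = file.read()
    # Split the whole text into lines (without the trailing newlines)
    splitted = text.split('\n')
    gradelist = []
    for item in splitted:
        # Keep only lines that are neither empty nor comments
        if item != "" and item[0] != "#":
            gradelist.append(item)
    print(gradelist)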
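The reason appending to a new list works where the original loop did not: calling remove() on a list while a for loop is iterating over it shifts the later items one position to the left, so the loop skips the item that directly follows each removal. The small demonstration below uses made-up data to show the effect, plus one possible workaround (iterating over a copy).

```python
lines = ["a\n", "\n", "\n", "b\n"]

# Buggy: the list shrinks while it is being iterated, so the second
# blank line slides into the slot the loop has already visited.
buggy = list(lines)
for item in buggy:
    if item == "\n":
        buggy.remove(item)
print(buggy)  # ['a\n', '\n', 'b\n']

# Workaround: iterate over a copy (safe[:]) and remove from the original.
safe = list(lines)
for item in safe[:]:
    if item == "\n":
        safe.remove(item)
print(safe)  # ['a\n', 'b\n']
```

Building a new list, as in the answer above, avoids the problem altogether and is usually easier to read than removing in place.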
Upvotes: 1
Reputation: 2477
with open('file.txt', 'r') as f:
    lines = f.read()
subjects = [row for row in lines.split('\n') if '#' not in row and row]
subjects = {row.split(':')[0].strip(): row.split(':')[1].strip() for row in subjects}
Output:
>>> subjects
{'subject1': '3.5', 'subject2': '4.25', 'subject3': '5.20', 'subject4': '4.75'}
The code is pretty simple. You split on \n and drop the empty rows and the rows containing #. Then you split each remaining row on : and create a dictionary.
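Note that '#' not in row drops any row containing a # anywhere, not only rows that start with one. That makes no difference for the file in the question, but it would matter if a grade line carried a trailing comment. The sketch below compares the two behaviours. The row 'subject5: 4.0  # retake' is a hypothetical input, not taken from the question.

```python
rows = ["subject1: 3.5", "# comment hello", "subject5: 4.0  # retake", ""]

# Filter from the answer: any row containing '#' is discarded entirely
dropped = [row for row in rows if "#" not in row and row]
print(dropped)  # ['subject1: 3.5']

# Alternative: cut each row at the first '#' and keep whatever precedes it
kept = []
for row in rows:
    content = row.split("#", 1)[0].strip()
    if content:
        kept.append(content)
print(kept)  # ['subject1: 3.5', 'subject5: 4.0']
```

Which behaviour is right depends on whether inline comments are expected in the file at all.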
Upvotes: 0