Hansel

Reputation: 31

Ignoring a comment line in file

Let's say there is a file that has lines of numbers and comments like:

#comments
12 #this is number
2.4 #this is float

I want to read the file and append only the numbers to a list. I'm trying to get just the numbers, but somehow my code also appends the comment text (#this is number and #this is float).

Upvotes: 0

Views: 177

Answers (3)

Eric O. Lebigot

Reputation: 94595

With such a simple case, you do not have to use the more complex and slower machinery of regular expressions (re module). str.split() is your friend:

output = []

with open('somefile.txt') as f:
    for line in f:
        parts = line.split('#', 1)  # Maximum 1 split, on comments
        try:
            output.append(float(parts[0]))  # The single, or pre-comment part is added
        except ValueError:  # Beginning is not float-like: happens for "# comment", "    # comment", etc.
            pass  # No number found

This automatically handles all the possible syntaxes for floats (1.1e2, nan, -inf, 3, etc.). It works because float() is quite powerful: it handles trailing spaces and newlines (by ignoring them).
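As a quick check of that claim, here is a small sketch that feeds float() the syntaxes listed above, plus a string padded with spaces and a newline. The samples list is made up for illustration.

```python
import math

# Hypothetical sample strings: the float syntaxes mentioned above,
# plus one padded with spaces and a trailing newline.
samples = ["1.1e2", "nan", "-inf", "3", "  2.4 \n"]

# float() ignores surrounding whitespace, so no strip() is needed.
values = [float(s) for s in samples]

print(values)
```

Note that nan never compares equal to itself, so math.isnan() is the way to test for it.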

This is also quite efficient, because a try that does not fail is fast (faster than an explicit test, usually).

This also handles comments found in the middle of the file. If the only pure comment line is the first line of the file, you can simplify the code and rely on the fact that every remaining line is guaranteed to contain a number:

output = []

with open('somefile.txt') as f:
    next(f)  # Skips the first, comment line
    for line in f:
        output.append(float(line.split('#', 1)[0]))  # The single or pre-comment part is guaranteed to be a float representation

I don't think there is any explicit approach that is much simpler than this (other than calling split('#') without the limit, which may compute more line parts than needed).

That said, an implicit approach can be considered, like that of abathur, where eval(line) replaces the whole float(…) part. In that case, however, the code does not show that floats are expected, and as the Zen of Python says, "Explicit is better than implicit." I therefore do not recommend using the eval() approach, except in a one-shot, quick and dirty script.

Upvotes: 2

Blender

Reputation: 298512

You could use split:

>>> 'foo #comment'.split('#', 1)[0]
'foo '
>>> 'foo comment'.split('#', 1)[0]
'foo comment'
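Applied to the lines from the question, this could look like the sketch below. The lines list stands in for the file contents and is only an assumption about how the data is read. A pure comment line leaves an empty string after the split, so it has to be filtered out before converting.

```python
# Stand-in for the file contents from the question (assumed, not read from disk).
lines = ["#comments", "12 #this is number", "2.4 #this is float"]

# Keep only the part before the first '#', then drop surrounding whitespace.
stripped = [line.split('#', 1)[0].strip() for line in lines]

# Pure comment lines leave '', which float() would reject, so skip them.
numbers = [float(s) for s in stripped if s]

print(numbers)
```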

Upvotes: 4

abathur

Reputation: 1047

While the others have covered the file reading logistics, I just wanted to note another approach: Assuming the file you have follows Python's syntax, you could use the eval function to get the value of the line minus the comments.

>>> eval("10 #comment")
10

Keep in mind, of course, that eval() has security implications: it executes Python code, so arbitrary code execution could be a vulnerability if you have less control over the data file than over the script that reads it.
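If that risk is a concern, one possible alternative (not part of the approach above, just a swapped-in technique) is ast.literal_eval() from the standard library, which evaluates only Python literals and refuses function calls. The helper name parse_line and the sample lines below are hypothetical.

```python
import ast

def parse_line(line):
    """Return the number on a line, or None if there is no numeric literal."""
    try:
        # literal_eval parses Python literals only; trailing comments are ignored.
        value = ast.literal_eval(line.strip())
    except (ValueError, SyntaxError):
        # Pure comment lines, blank lines, and non-literal code end up here.
        return None
    return value if isinstance(value, (int, float)) else None

# Stand-in for the file contents from the question.
lines = ["#comments\n", "12 #this is number\n", "2.4 #this is float\n"]
numbers = [n for n in map(parse_line, lines) if n is not None]

print(numbers)
```

Unlike float(), this keeps 12 as an int, and it rejects strings such as nan or inf because those are names rather than literals.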

Upvotes: 0
