Reputation: 31
Let's say there is a file that has lines of numbers and comments like:
#comments
12 #this is number
2.4 #this is float
I want to read the file and append the numbers to a list. I'm trying to get just the numbers, but somehow the comments (#this is number and #this is float) get appended as well.
Upvotes: 0
Views: 177
Reputation: 94595
With such a simple case, you do not have to use the more complex and slower machinery of regular expressions (the re module). str.split() is your friend:
    output = []
    with open('somefile.txt') as f:
        for line in f:
            parts = line.split('#', 1)  # Maximum 1 split, on comments
            try:
                output.append(float(parts[0]))  # The single, or pre-comment part is added
            except ValueError:  # Beginning is not float-like: happens for "# comment", " # comment", etc.
                pass  # No number found
This automatically handles all the possible syntaxes for floats (1.1e2, nan, -inf, 3, etc.). It works because float() is quite powerful: it handles trailing spaces and newlines (by ignoring them).
This is also quite efficient, because a try that does not fail is fast (faster than an explicit test, usually).
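To make the behavior easy to check without a file on disk, here is a minimal sketch of the same split-then-float() logic applied to a list of lines. The function name parse_numbers and the sample lines are hypothetical, chosen only for illustration.

```python
def parse_numbers(lines):
    """Return the floats found before any '#' comment on each line."""
    numbers = []
    for line in lines:
        before_comment = line.split('#', 1)[0]  # Text before the first '#'
        try:
            numbers.append(float(before_comment))  # float() ignores surrounding whitespace
        except ValueError:
            pass  # Pure comment line, blank line, or non-numeric text
    return numbers


# Hypothetical sample data, mirroring the file in the question
sample = [
    "#comments\n",
    "12 #this is number\n",
    "2.4 #this is float\n",
    "  -inf  # negative infinity\n",
    "1.1e2\n",
]
print(parse_numbers(sample))  # [12.0, 2.4, -inf, 110.0]
```

Since an open file object yields its lines when iterated, the same function should work when passed the f from the with block above.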
This also handles comment-only lines found in the middle of the file. If the only pure comment line is the first line of the file, the code can be simplified by relying on the fact that every remaining line is guaranteed to contain a number:
    output = []
    with open('somefile.txt') as f:
        next(f)  # Skips the first, comment line
        for line in f:
            output.append(float(line.split('#', 1)[0]))  # The single or pre-comment part is guaranteed to be a float representation
I don't think that there is any explicit approach which is much simpler than this (beyond calculating possibly too many line parts with split('#') instead).
That said, an implicit approach can be considered, like that of abathur, where eval(line) replaces the whole float(…) part; however, in this case, the code does not show that floats are expected, and as the Zen of Python says, "Explicit is better than implicit." I therefore do not recommend using the eval() approach, unless it is for a one-shot, quick and dirty script.
Upvotes: 2
Reputation: 298512
You could use split:
>>> 'foo #comment'.split('#', 1)[0]
'foo '
>>> 'foo comment'.split('#', 1)[0]
'foo comment'
Upvotes: 4
Reputation: 1047
While the others have covered the file reading logistics, I just wanted to note another approach: Assuming the file you have follows Python's syntax, you could use the eval function to get the value of the line minus the comments.
>>> eval("10 #comment")
10
Keep in mind, of course, that there are security considerations with eval(): it executes Python code, and arbitrary code execution could be a vulnerability for your program if you have less control over the data file than you do over the script that reads it.
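If the security concern matters, one possible variation (not part of the original suggestion) is the standard library's ast.literal_eval(), which only accepts Python literals and does not execute code, while still ignoring trailing comments. The sketch below is an assumption-laden illustration: the helper name parse_literal_numbers is hypothetical, and unlike float(), this approach rejects nan and inf because they are names rather than literals.

```python
import ast


def parse_literal_numbers(lines):
    """Hypothetical helper: collect int/float literals, ignoring '#' comments."""
    numbers = []
    for line in lines:
        try:
            # literal_eval parses a single Python literal; a trailing comment is ignored
            value = ast.literal_eval(line.strip())
        except (ValueError, SyntaxError):
            continue  # Pure comment line, blank line, or not a literal
        # Keep only numbers (bool is a subclass of int, so exclude it explicitly)
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            numbers.append(value)
    return numbers


print(parse_literal_numbers(["#comments\n", "12 #this is number\n", "2.4 #this is float\n"]))
# [12, 2.4]
```

Note that integers stay integers here (12 rather than 12.0), which may or may not be what you want.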
Upvotes: 0