Reputation: 1
basically i have a massive text file that has several lines that have nothing on them but an '@' symbol.
i want to print every line that precedes the FIRST line that is nothing but a single '@' symbol.
i'm new to python but pretty familiar with regex but i just can't figure this out. here's what i've got so far:
original = open('oldfile.txt')
for each_line in original:
    pattern = re.compile("(^.*)(^@\s)", re.M)
    m = re.match(pattern, each_line).group(1)
    print(m)
original.close
i swear i have been reading the python online docs and other stackoverflow articles for an hour and a half and somehow i'm still not getting this.
the result of that code is:
AttributeError: 'NoneType' object has no attribute 'group'
Upvotes: 0
Views: 62
Reputation: 89097
You don't need regular expressions here, it's actually pretty simple:
with open('file.txt') as file:
    for line in file:
        line = line.rstrip("\n")
        if line == "@":
            break
        print(line)
We open the file (using the with statement, which is both more readable and ensures the file is closed, even if an exception occurs), then we loop through the lines in the file. We break out of the loop if the line is just "@"; otherwise, we print the line and continue.
As pointed out in the comments, we need to strip the newline character off the line, or else check against "@\n". If we did the latter, we would also need to use print(line, end="") in 3.x or print line, in 2.x to stop print() adding an extra newline.
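For comparison, here is a minimal sketch of the "@\n" variant described above. The function name print_before_marker and its out parameter are hypothetical, added only so the example is self-contained and runs without a real file. It assumes Python 3 and that the marker line ends with a newline (a final "@" with no trailing newline would not be caught by this comparison).

```python
import io
import sys

def print_before_marker(lines, out=sys.stdout):
    # Print each line until the first line that is exactly "@" plus a newline.
    for line in lines:
        # Compare against the marker with its newline still attached.
        if line == "@\n":
            break
        # The line already ends in a newline, so suppress print()'s own.
        print(line, end="", file=out)

# Demo with an in-memory file standing in for the real one.
sample = io.StringIO("first\nsecond\n@\nthird\n@\n")
print_before_marker(sample)
```

Any iterable of lines works here, so an open file object can be passed in place of the StringIO.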
As Martijn Pieters points out, there is another way to do this, using the takewhile() function from itertools. This takes items from an iterable for as long as a condition holds, stopping at the first item that fails it, which is exactly what we want here:
import itertools

with open('file.txt') as file:
    for line in itertools.takewhile(lambda x: x != "@\n", file):
        print(line, end="")
I would argue that for just printing the values, this is harder to read. There are cases where it's useful, however: for example, if you wish to make a list of the values or pass them into another function, having them as an iterable is handy.
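As a rough illustration of the list case, the sketch below collects the lines instead of printing them. The helper name lines_before_marker and its marker parameter are made up for this example, and it strips the newline first so the comparison is against a bare "@".

```python
import itertools

def lines_before_marker(lines, marker="@"):
    # Strip the trailing newline lazily, one line at a time.
    stripped = (line.rstrip("\n") for line in lines)
    # takewhile stops at the first line equal to the marker.
    return list(itertools.takewhile(lambda line: line != marker, stripped))

sample = ["first\n", "second\n", "@\n", "third\n", "@\n"]
print(lines_before_marker(sample))
```

Because takewhile is lazy, the rest of the file after the first marker is never read, which matters for a massive file.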
Upvotes: 3
Reputation: 65871
As Lattyware has mentioned, you don't need regex for this.
As for the problem with your code: when the string doesn't match the pattern, re.match returns None rather than a match object. You can't call the group method in that case. That's the reason for the exception: None, which is an instance (the only instance) of type NoneType, doesn't have an attribute group.
Also, the re.compile call should be outside the loop; otherwise there's not much point in compiling the regex explicitly.
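If you do want to keep a regex, a sketch along these lines applies both points: the pattern is compiled once, and the result of match() is checked for None before anything else is done with it. The names MARKER and lines_until_marker are hypothetical, and the pattern is an assumption about what counts as a marker line (an "@" optionally followed by whitespace), not the pattern from the question.

```python
import re

# Compiled once, outside the loop. Assumption: a marker line is "@"
# optionally followed by whitespace, then the end of the line.
MARKER = re.compile(r"@\s*$")

def lines_until_marker(lines):
    result = []
    for line in lines:
        match = MARKER.match(line)  # None when the line is not a marker
        if match is not None:
            break
        result.append(line.rstrip("\n"))
    return result

sample = ["first\n", "@ not a marker\n", "@\n", "after\n"]
print(lines_until_marker(sample))
```

Each line is tested on its own here, so the pattern only needs to describe a single line rather than span two.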
Upvotes: 0