Reputation: 577
Here is my code:
import re

def get_email_answers(path):
    for line in path:
        clear = line.strip()
        if re.match(r".*\s.*\t(Antw.+)\t.*Uhr", clear):
            subject = re.findall(r".*\s.*\t(Antw.+)\t.*Uhr", clear)
            print(subject)

def get_sizes(path):
    for line in path:
        clear = line.strip()
        if re.match(r".*\s([0-9][0-9]\s[MKG]B)", clear):
            size = re.findall(r".*([0-9][0-9]\s[MKG]B)", clear)
            print(size)
        elif re.match(r".*\s([0-9][0-9][0-9]\s[MKG]B)", clear):
            size = re.findall(r".*([0-9][0-9][0-9]\s[MKG]B)", clear)
            print(size)
        elif re.match(r".*\s([0-9]\s[MKG]B)", clear):
            size = re.findall(r".*([0-9]\s[MKG]B)", clear)
            print(size)
        elif re.match(r".*(.\.[0-9][0-9]\s[MKG]B)", clear):
            size = re.findall(r".*(.\.[0-9][0-9]\s[MKG]B)", clear)
            print(size)

file_opener = open(r"C:\Users\julia\Documents\RegEX-Test.txt", "r")
get_sizes(file_opener)
get_email_answers(file_opener)
The function get_sizes works, but the function get_email_answers doesn't. If you comment out the call to get_sizes, then get_email_answers works perfectly. If you call get_email_answers before get_sizes, then get_email_answers works and get_sizes doesn't.
To debug it, I have done this:
def get_email_answers(path):
    print(path)  # modified here
    for line in path:
        print("line")  # and here
        clear = line.strip()
        if re.match(r".*\s.*\t(Antw.+)\t.*Uhr", clear):
            subject = re.findall(r".*\s.*\t(Antw.+)\t.*Uhr", clear)
            print(subject)
The printed path is the same as in get_sizes, but the for loop never runs. Why? And why does it run when the call to the other function, get_sizes, is commented out?
Upvotes: 1
Views: 81
Reputation: 538
Reading a file is a sequential process. When you open a file, an internal "pointer" is created that remembers where in the file you are. At first it points to the beginning of the file, and each time you read a chunk, the pointer moves past that chunk and points to the first byte that hasn't been read yet. So after one of your functions has read the file, the pointer sits at the end, and when the second function tries to read it, the file appears empty. You need to reset the pointer between reads by calling file_opener.seek(0).
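As a minimal sketch of the pointer behaviour described above: the snippet below writes a small temporary file (a made-up stand-in for RegEX-Test.txt, not the asker's data), iterates over it twice, and only gets lines again after rewinding. The names sample_path, first_pass, second_pass and third_pass are hypothetical.

```python
import os
import tempfile

# Hypothetical sample file standing in for the real input file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    sample_path = tmp.name

with open(sample_path, "r") as f:
    first_pass = [line.strip() for line in f]   # consumes the file up to its end
    second_pass = [line.strip() for line in f]  # pointer is at the end: no lines left
    f.seek(0)                                   # move the pointer back to the start
    third_pass = [line.strip() for line in f]   # lines are available again

os.remove(sample_path)

print(first_pass)
print(second_pass)
print(third_pass)
```

The second loop does not raise an error; it simply has nothing to iterate over, which matches the silent for loop in the question.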
Btw, file_opener is a slightly confusing name: this variable holds the file object itself, not some object that offers functionality for opening a file.
Upvotes: 2
Reputation: 5888
A file object can only be read through once unless you rewind it. You either have to store the file data in a variable (data = file_opener.read()) and iterate over that, or you need to reset the file pointer to the start before the next function reads the file.
Try this:
get_sizes(file_opener)
file_opener.seek(0)
get_email_answers(file_opener)
To clarify: the issue isn't in the functions, it's in the way you handle the input file.
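The other option mentioned above, reading the data into a variable once and iterating over that, could look roughly like the sketch below. It is only an illustration under assumptions: the functions return lists instead of printing, the size pattern is a simplified single regex rather than the original chain of elif branches, and the two sample lines are invented, since the real file contents are not shown.

```python
import re

def get_email_answers(lines):
    """Return the subjects of reply mails ("Antw...") found in lines."""
    pattern = re.compile(r".*\s.*\t(Antw.+)\t.*Uhr")
    subjects = []
    for line in lines:
        match = pattern.match(line.strip())
        if match:
            subjects.append(match.group(1))
    return subjects

def get_sizes(lines):
    """Return size strings such as '12 KB' (simplified pattern, an assumption)."""
    pattern = re.compile(r"(\d+(?:\.\d+)?\s[MKG]B)")
    sizes = []
    for line in lines:
        match = pattern.search(line.strip())
        if match:
            sizes.append(match.group(1))
    return sizes

# Made-up sample data. With a real file you would read it once instead:
#     with open(path, "r") as f:
#         lines = f.read().splitlines()
lines = [
    "Max Mustermann\tAntw: Meeting\t12:30 Uhr\t12 KB",
    "Erika Musterfrau\tProjektplan\t09:15 Uhr\t1.25 MB",
]

# A list can be iterated any number of times, so the call order no longer matters.
sizes = get_sizes(lines)
subjects = get_email_answers(lines)
print(sizes)
print(subjects)
```

Because both functions receive a plain list, neither one can exhaust the input for the other, at the cost of holding the whole file in memory.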
Upvotes: 1