Reputation: 417
I have a function that reads a large .txt file line by line.
As a parameter I pass the function the line index from which it should start reading the file.
First I call the function with 0 so that it begins from the start. At the end it calls itself again with a new parameter, but when execution re-enters the function, the freshly passed index (which is different now) is still treated as 0 in the for statement. :(
from __future__ import print_function
import os
import sys

file = open("file.txt").read().splitlines()
for i, line in enumerate(file):
    if file[i] == "@@@TC_FIN@@@":
        fin = i
        #print (fin)

def AssembleTC(index):
    while index < fin:
        for index, line in enumerate(file):
            if "@@@ ID:" in line:
                print(file[index+1])
                break
        for index, line in enumerate(file):
            if file[index] == "@@@TC_FIN@@@":
                recursive = index
                #print (recursive)
                break
        AssembleTC(recursive+1)

AssembleTC(0)
It is vital for me to keep the present for statement with the file[index] access pattern. I've read that I could skip lines with something like file.next(),
but it doesn't work.
Is there any way to skip the number of lines that I want, or simply to start the new pass from the updated index? Python 2.7.13 - Thank you!
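For reference, skipping a fixed number of lines on a plain file object is possible with itertools.islice; the in-memory sample data below is just a hypothetical stand-in for the real file:

```python
from itertools import islice
import io

# hypothetical stand-in for an open file object
fp = io.StringIO("line0\nline1\nline2\nline3\nline4\n")

# islice(fp, start, None) skips the first `start` lines and yields
# the rest lazily, without loading the whole file into memory
remaining = [line.rstrip("\n") for line in islice(fp, 2, None)]
print(remaining)  # ['line2', 'line3', 'line4']
```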
Upvotes: 4
Views: 4197
Reputation: 417
I have implemented my idea by erasing the lines I had already parsed, and it works very well. This is only my happy case, though, because I no longer need any of the data I have already processed. For those who will still need it, I think @tdelaney's code is the one to use; my thanks to him for his answer!
Here is how I did it:
from __future__ import print_function
import os
import sys

initialCall = os.stat("test.txt").st_size

def AssembleTC(parameter):
    print("CALLED PARAMETER = " + str(parameter))
    if parameter == 0:
        sys.exit()
    else:
        file = open("test.txt").read().splitlines()
        for index, line in enumerate(file):
            if file[index] == "@@@TC_FIN@@@":
                fin = index
                print("FIN POSITION = " + str(fin))
                break
        check = os.stat("test.txt").st_size
        print("File size = " + str(check))
        while check > 1:
            for index, line in enumerate(file):
                if "@@@ TC NR" in line:
                    print(file[index+1])
                    break
            ok = 0
            with open("test.txt", "r") as textobj:
                mylist = list(textobj)
                del mylist[0:fin+1]
                ok = 1
            if ok == 1:
                with open("test.txt", "w") as textobj:
                    for n in mylist:
                        textobj.write(n)
            print("OLD SIZE = " + str(check))
            check = os.stat("test.txt").st_size
            print("NEW SIZE = " + str(check) + "\n")
            AssembleTC(check)

AssembleTC(initialCall)
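The core of the erase-and-rewrite trick can be sketched on its own; the helper name and the throwaway file below are made up for illustration:

```python
import os
import tempfile

def drop_parsed_block(path, marker="@@@TC_FIN@@@"):
    # read all lines (keeping newlines), drop everything up to and
    # including the first FIN marker, then rewrite the file in place
    with open(path) as f:
        lines = f.read().splitlines(True)
    for i, line in enumerate(lines):
        if line.strip() == marker:
            lines = lines[i+1:]
            break
    with open(path, "w") as f:
        f.writelines(lines)
    return os.stat(path).st_size  # new size, used as the stop check

# demo on a throwaway file
tmp = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
tmp.write("@@@ TC NR 1\nstep\n@@@TC_FIN@@@\n@@@ TC NR 2\n")
tmp.close()
size = drop_parsed_block(tmp.name)
with open(tmp.name) as f:
    rest = f.read()
print(rest)  # "@@@ TC NR 2\n"
os.unlink(tmp.name)
```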
Upvotes: 0
Reputation: 77337
It's a large text file, so I think it is worth revisiting the idea of reading it line by line. File objects keep track of where they are in the file, so iteration can be resumed inside nested for loops for additional processing. Generators use yield to pass results back to callers and are a good way to encapsulate this functionality.
This example scans the file until it sees the ID, gathers lines until it sees the FIN marker, then hands the block back to the caller. Since it's a generator, it can be driven from a for loop to get all of the records in turn.
from __future__ import print_function

def my_datablock_iter(fileobj):
    for line in fileobj:
        # find ID
        if "@@@ ID:" in line:
            # build a list of lines until FIN is seen
            wanted = [line.strip()]
            for line in fileobj:
                line = line.strip()
                if line == "@@@TC_FIN@@@":
                    break
                wanted.append(line)
            # hand block back to the caller
            yield wanted

with open("file.txt") as fp:
    for datablock in my_datablock_iter(fp):
        print(datablock)
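To see what the generator hands back, here is a minimal run on in-memory sample data (the file contents are hypothetical, fed through io.StringIO):

```python
import io

def my_datablock_iter(fileobj):
    # scan until an ID line, then collect lines until the FIN marker
    for line in fileobj:
        if "@@@ ID:" in line:
            wanted = [line.strip()]
            for line in fileobj:
                line = line.strip()
                if line == "@@@TC_FIN@@@":
                    break
                wanted.append(line)
            yield wanted

sample = io.StringIO(
    "@@@ ID: 1\n"
    "step one\n"
    "@@@TC_FIN@@@\n"
    "@@@ ID: 2\n"
    "step two\n"
    "@@@TC_FIN@@@\n"
)
blocks = list(my_datablock_iter(sample))
print(blocks)  # [['@@@ ID: 1', 'step one'], ['@@@ ID: 2', 'step two']]
```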
Upvotes: 2