Marko

Reputation: 417

How to start reading from a file at a certain line?

I have a function that reads a large .txt file, line by line.

The function takes as its parameter the line index from which it should start reading the file.

First I call the function with 0 so that it begins at the start. At the end the function calls itself again with a new index, but when it re-enters, the newly passed index (which is now different) is 0 again inside the for statement. :(

from __future__ import print_function
import os
import sys

file = open("file.txt").read().splitlines()

for i, line in enumerate(file):
    if file[i] == "@@@TC_FIN@@@":
        fin = i;
        #print (fin)

def AssembleTC(index):

   while index < fin:

       for index, line in enumerate(file):
           if "@@@ ID:" in line:
               print(file[index+1])
               break

       for index, line in enumerate(file):
           if file[index] == "@@@TC_FIN@@@":
               recursive = index;
               #print (recursive)
               break

       AssembleTC(recursive+1)

AssembleTC(0)

It is vital for me to keep the current for statement with its file[index] access. I've read that I could skip lines with something like file.next(), but that doesn't work.

Is there any way to skip a given number of lines, or simply to start the next read from the updated index? I'm using Python 2.7.13. Thank you!

Upvotes: 4

Views: 4197

Answers (2)

Marko

Reputation: 417

I implemented my idea by deleting the lines I have already parsed, and it works very well. This only suits my own case, though, because I no longer need any of the data I have already processed. For those who still need that data, I think @tdelaney's code is a good choice, and I thank him for his answer!

Here is how I did it:

from __future__ import print_function
import os
import sys

initialCall = os.stat("test.txt").st_size

def AssembleTC(parameter):

  print("CALLED PARAMETER = " + str(parameter))
  if parameter == 0:
      sys.exit()
  else:
      file = open("test.txt").read().splitlines()
      for index, line in enumerate(file):
          if file[index] == "@@@TC_FIN@@@":
              fin = index;
              print ("FIN POSITION = " + str(fin))
              break

      check = os.stat("test.txt").st_size
      print("File size = " + str(check))

      while check > 1:
          for index, line in enumerate(file):
              if "@@@ TC NR" in line:
                  print(file[index+1])
                  break
          ok=0
          with open("test.txt","r") as textobj:
              mylist = list(textobj)
              del mylist[0:fin+1]
              ok=1

          if ok==1:    
              with open("test.txt", "w") as textobj:
                  for n in mylist:
                      textobj.write(n)

          print("OLD SIZE = " + str(check))
          check = os.stat("test.txt").st_size
          print("NEW SIZE = " + str(check) + "\n")

          AssembleTC(check)

AssembleTC(initialCall)
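
For readers who still need the earlier lines, here is a minimal sketch of an alternative that keeps the file[index] style of access without rewriting the file. It relies on the second argument of enumerate, which sets the starting count so that indices stay absolute. The names FIN_MARKER, find_fin, block_ranges and the sample list are hypothetical and used only for illustration.

```python
from __future__ import print_function

FIN_MARKER = "@@@TC_FIN@@@"

def find_fin(lines, start):
    """Return the index of the first FIN marker at or after start, or None."""
    # The second argument to enumerate keeps the indices absolute, so
    # lines[index] refers to the same line as in the full list.
    for index, line in enumerate(lines[start:], start):
        if lines[index] == FIN_MARKER:
            return index
    return None

def block_ranges(lines):
    """Return (first, fin) index pairs, one per block that ends in FIN."""
    ranges = []
    start = 0
    while True:
        fin = find_fin(lines, start)
        if fin is None:
            break
        ranges.append((start, fin))
        start = fin + 1  # resume just past the marker
    return ranges

# Hypothetical sample data standing in for open("test.txt").read().splitlines()
sample = ["@@@ ID: 1", "first", "@@@TC_FIN@@@",
          "@@@ ID: 2", "second", "@@@TC_FIN@@@"]

for first, fin in block_ranges(sample):
    print(sample[first + 1])
```

Note that lines[start:] copies the rest of the list on every call, so for a very large file it may be better to loop over range(start, len(lines)) instead.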

Upvotes: 0

tdelaney

Reputation: 77337

It's a large text file, so I think it is worth revisiting the idea of reading it line by line. File objects keep track of their position in the file, so an inner for loop can pick up reading where the outer loop left off. Generators use yield to pass results back to callers and are a good way to encapsulate this kind of functionality.

This example scans a file until it sees the ID, gathers lines until it sees the FIN marker, then hands the data back to the caller. It's a generator, so it can be called from a for loop to get all of the records in turn.

from __future__ import print_function

def my_datablock_iter(fileobj):
    # iterate over the file object that was passed in; it remembers its position
    for line in fileobj:
        # find ID
        if "@@@ ID:" in line:
            # build a list of lines until FIN is seen
            wanted = [line.strip()]
            # the inner loop continues from the line after the ID
            for line in fileobj:
                line = line.strip()
                if line == "@@@TC_FIN@@@":
                    break
                wanted.append(line)
            # hand block back to user
            yield wanted

with open("file.txt") as fp:
    for datablock in my_datablock_iter(fp):
        print(datablock)
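
If the goal is simply to start reading at a known line index, itertools.islice can skip the leading lines of any iterator, including a file object. The sketch below is only an illustration: lines_from is a hypothetical helper, and io.StringIO stands in for a real file so that the example is self-contained.

```python
from __future__ import print_function, unicode_literals
import io
from itertools import islice

def lines_from(fileobj, start):
    """Yield (index, line) pairs beginning at zero-based line `start`."""
    # islice reads and discards the first `start` lines, then passes the
    # rest through; enumerate(..., start) keeps the indices absolute.
    for index, line in enumerate(islice(fileobj, start, None), start):
        yield index, line.rstrip("\n")

# In-memory stand-in for open("file.txt")
fake_file = io.StringIO("a\nb\nc\nd\n")
for index, line in lines_from(fake_file, 2):
    print(index, line)
```

The skipped lines are still read from the file; islice does not seek. It only saves you from handling them yourself.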

Upvotes: 2
