Subhayan Bhattacharya

Reputation: 5725

Reading entire file data from a constantly updated file in Python

Let's say I have a log file, output.log, that is constantly updated by a separate process, for example some Java code elsewhere on the system.

I also have a separate Python process that reads the log file to parse it and extract some data. I am using dead simple Python code to do this:

with open('output.log') as f:
    for line in f:
        # Do something with that line
        pass

The issue is that I don't know how frequently the file gets updated. How does Python figure out when to stop if the file is constantly being updated?

Shouldn't the program just hang, waiting for data indefinitely?

Thanks in advance for any answers.

Upvotes: 0

Views: 1415

Answers (3)

Chris

Reputation: 338

The for loop reads until it hits the current end of the file and then terminates; it does not wait for more data. To keep following the file, you could do something like this:

#!/usr/bin/env python
import os
import sys
import time


def process_line(line):
    print(line.rstrip("\n"))


def process_file(f):
    for line in f:
        process_line(line)


def tail(path):
    old_size = 0
    pos = 0
    while True:
        new_size = os.stat(path).st_size
        # Only reopen the file when it has grown since the last check.
        if new_size > old_size:
            # The "U" mode was removed in Python 3.11; plain "r" already
            # uses universal newlines.
            with open(path, "r") as f:
                f.seek(pos)
                process_file(f)
                pos = f.tell()
            old_size = new_size
        time.sleep(1)


if __name__ == "__main__":
    tail(sys.argv[1])

Of course, this assumes the file isn't rotated (rolled over) with its size reset to zero.
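If the file can be reset, one possible way to cope is to compare the current size with the saved offset and start over when the file has shrunk. The following is only a sketch under that assumption: the `max_polls` parameter is a hypothetical knob added so the loop can end, and a file that is replaced and grows past the old offset between two polls would not be noticed.

```python
import os
import time


def tail(path, poll_interval=1.0, max_polls=None):
    """Yield complete lines appended to path, restarting if the file shrinks.

    max_polls is a hypothetical limit on the number of size checks;
    None means poll forever.
    """
    pos = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        size = os.stat(path).st_size
        if size < pos:
            # The file is shorter than our saved offset, so assume it was
            # truncated or replaced and read it again from the start.
            pos = 0
        if size > pos:
            # Binary mode keeps tell()/seek() offsets in plain bytes.
            with open(path, "rb") as f:
                f.seek(pos)
                while True:
                    raw = f.readline()
                    if not raw.endswith(b"\n"):
                        # End of file, or a line the writer has not
                        # finished yet; retry on the next poll.
                        break
                    pos = f.tell()
                    yield raw.decode("utf-8", errors="replace")
        polls += 1
        time.sleep(poll_interval)
```

Because this version is a generator, the caller decides what to do with each line, for example `for line in tail("output.log"): ...`. Only lines that end in a newline are yielded, so a half-written line is picked up once the writer completes it.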

Upvotes: 0

Shubham Jain

Reputation: 5536

Generators can be of great help here.

# follow.py
#
# Follow a file like tail -f.

import time
import os

def follow(thefile):
    thefile.seek(0, os.SEEK_END)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

# Example use


if __name__ == '__main__':
    logfile = open("run/foo/access-log","r")
    loglines = follow(logfile)
    for line in loglines:
        print(line, end='')

To stop following the log file, just break out of the final for loop.

You can perform any operation on each line inside that final for loop.
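As a rough illustration of both ways of stopping, here is a variant of follow() that also gives up after a period of silence. This is a sketch, not the original recipe: the `max_idle` parameter and the `read_until` helper are hypothetical names, and unlike the version above it starts from the file's current position instead of seeking to the end.

```python
import time


def follow(thefile, poll_interval=0.1, max_idle=None):
    """Yield lines as they appear; stop after max_idle seconds without data.

    max_idle is a hypothetical parameter; None means wait forever.
    """
    idle_since = time.monotonic()
    while True:
        line = thefile.readline()
        if line:
            yield line
            idle_since = time.monotonic()
            continue
        if max_idle is not None and time.monotonic() - idle_since >= max_idle:
            return  # nothing new for max_idle seconds: end the generator
        time.sleep(poll_interval)


def read_until(path, marker, max_idle=5.0):
    """Collect lines until one contains marker or the file goes quiet."""
    collected = []
    with open(path, "r") as logfile:
        for line in follow(logfile, max_idle=max_idle):
            collected.append(line)
            if marker in line:
                break  # explicit stop from the consuming loop
    return collected
```

The break covers the case where you know what you are waiting for; the idle timeout covers the case where the writer simply stops.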

To get more familiar with generators, I would suggest reading Generator Tricks for Systems Programmers.

Upvotes: 2

Drastik MoustiK

Reputation: 7

You should use something based on the tail -f functionality if you want to keep reading from the file.

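import time

# "output.log" is the file from the question; adjust the path as needed.
with open("output.log") as file:
    while True:
        where = file.tell()
        line = file.readline()
        if not line:
            time.sleep(1)
            file.seek(where)  # return to the saved position before retrying
        else:
            print(line, end="")  # line already has a newline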

Example taken from here: http://code.activestate.com/recipes/157035-tail-f-in-python/
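The loop works because readline() returns an empty string at the current end of the file instead of blocking, and the same file handle sees data that is appended later. A small self-contained check of that behaviour, using a temporary file in place of a real log (the `demo` function is just an illustrative name):

```python
import os
import tempfile


def demo():
    """Return what a reader sees before and after another handle appends."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        with open(path, "w") as writer:
            writer.write("first\n")
        with open(path, "r") as reader:
            # A for loop stops at the current end of the file.
            before = [line for line in reader]
            # At end of file readline() returns '' rather than waiting.
            at_eof = reader.readline()
            # Simulate the other process appending a line.
            with open(path, "a") as writer:
                writer.write("second\n")
            # The same handle picks up the new data on the next call.
            after = reader.readline()
        return before, at_eof, after
    finally:
        os.remove(path)
```

So the plain for loop from the question never hangs; it just ends when it reaches whatever the end of the file is at that moment.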

Upvotes: 1
