Reputation: 41
There is a directory in which new files, such as log files, are generated all the time.
My goal is to count how many files are generated in each 10-minute interval, in near real time. The data would look like this:
00:00 ~ 00:10 10 files
00:10 ~ 00:20 23 files
...
23:50 ~ 23:59 12 files
So my idea is to run a statistics script every 10 minutes as a crontab task on a Linux system.
Logic: the first time the script runs, it gets the current file list with glob.glob("*"). Call that list A.
When the script runs again (10 minutes later), it calls glob again to get the current file list B.
I need the entries that are in B but not in A, so that I can count them.
How can I do this? If you know a better way, please share.
Upvotes: 0
Views: 92
Reputation: 123453
As I commented on @tcaswell's answer, using Python's built-in set class is an excellent way to solve a problem like this. Here's some sample code loosely based on Tim Golden's Python Stuff article Watch a Directory for Changes:
    import os

    firstime = False
    path_to_watch = '.'

    # load the listing saved by the previous run, if there is one
    try:
        with open('filelist.txt', 'rt') as filelist:
            before = set(line.strip() for line in filelist)
    except IOError:
        # no saved listing yet, so this is the first run
        before = set(os.listdir(path_to_watch))
        firstime = True

    if firstime:
        after = before
    else:
        after = set(os.listdir(path_to_watch))
        # filelist.txt is written into the watched directory itself, so ignore it
        after.discard('filelist.txt')
        added = after - before
        removed = before - after
        if added:
            print('Added: ' + ', '.join(added))
        if removed:
            print('Removed: ' + ', '.join(removed))

    # replace/create filelist
    with open('filelist.txt', 'wt') as filelist:
        filelist.write('\n'.join(after) + '\n')
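Since the question asks for a count per 10-minute run rather than the file names, the same set-difference idea can be wrapped in a function that returns only the number of new entries. This is a minimal sketch, not part of the sample above: count_new_files and snapshot_path are hypothetical names, and it assumes the snapshot file is kept outside the watched directory so that it is never counted itself.

```python
import os


def count_new_files(directory, snapshot_path):
    """Return how many entries appeared in `directory` since the previous call.

    `snapshot_path` (hypothetical name) is a text file holding the listing
    from the previous run; it is assumed to live outside `directory`.
    """
    current = set(os.listdir(directory))
    try:
        with open(snapshot_path, 'rt') as snapshot:
            previous = set(line.rstrip('\n') for line in snapshot)
    except IOError:
        # No snapshot yet (first run): treat every entry as already seen.
        previous = current

    # Save the current listing for the next run, one name per line.
    with open(snapshot_path, 'wt') as snapshot:
        snapshot.write(''.join(name + '\n' for name in sorted(current)))

    # Entries present now but absent from the previous listing.
    return len(current - previous)
```

Calling count_new_files('/path/to/logs', '/path/to/snapshot.txt') (both paths are placeholders) from the script that cron runs every 10 minutes, and appending the result with a timestamp to a report file, would give per-interval figures like the ones in the question. Files that are created and deleted between two runs are not seen by this approach.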
Upvotes: 0