Reputation: 1763
I'm working on a Java project where I need to monitor files in a certain directory and be notified whenever one of the files changes. This can be achieved using WatchService.
Furthermore, I want to know what changes were made, for example: "characters 10 to 15 were removed", "at index 13 characters 'abcd' were added"... I'm willing to take any solution, even one based on C that monitors the file system.
I also want to avoid the diff solution, both to avoid storing the same file twice and because of the complexity of the algorithm: it takes too much time for big files.
Thank you for your help. :)
Upvotes: 0
Views: 257
Reputation: 2868
If you're using Linux, the following code will detect changes in file length. You can easily extend it to track modifications.
Because you don't want to keep two copies of the file, there is no way to tell which characters were altered if either the file length is reduced (the lost characters can't be recovered) or the file was altered somewhere in the middle.
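#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>   /* lseek, close, sleep */

int main(void)
{
    /* Record the initial length of the file. */
    int fd = open("test", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    off_t length = lseek(fd, 0, SEEK_END);

    while (1)
    {
        off_t new_length;

        /* Reopen the file once per second and measure it again. */
        close(fd);
        sleep(1);
        fd = open("test", O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        new_length = lseek(fd, 0, SEEK_END);
        printf("new_length = %lld\n", (long long)new_length);
        if (new_length != length)
            printf("Length changed! %lld->%lld\n", (long long)length, (long long)new_length);
        length = new_length;
    }
}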
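For the one case the length check can fully explain, a file that only grows by appending, the added characters are simply the bytes between the old and the new length. A minimal sketch under that assumption; `read_appended` is a hypothetical helper name, not part of the code above:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the bytes appended to `path` since its length
 * was `old_len`, up to `cap` bytes, into `buf`.
 * Returns the number of bytes copied, 0 if the file did not grow, -1 on error.
 * Assumes the file only grew by appending; an in-place edit, or a truncation
 * followed by a rewrite, cannot be detected this way. */
ssize_t read_appended(const char *path, off_t old_len, char *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t new_len = lseek(fd, 0, SEEK_END);
    ssize_t got = 0;
    if (new_len < 0) {
        got = -1;
    } else if (new_len > old_len) {
        size_t want = (size_t)(new_len - old_len);
        if (want > cap)
            want = cap;
        /* pread reads at an explicit offset without moving the file position. */
        got = pread(fd, buf, want, old_len);
    }
    close(fd);
    return got;
}
```

In the polling loop, this could be called with the previous `length` whenever `new_length` is larger, to report something like "at index `length`, these characters were added".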
[EDIT]
Since the author accepts changes to the kernel for this task, the following change to vfs_write should do the trick:
#define MAX_DIFF_LENGTH 128

/* Debug counter: only the first 20 write requests are logged. */
static int ___ishay;

ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
{
    char old_content[MAX_DIFF_LENGTH+1];
    char new_content[MAX_DIFF_LENGTH+1];
    ssize_t ret;

    if (!(file->f_mode & FMODE_WRITE))
        return -EBADF;
    if (!file->f_op || (!file->f_op->write && !file->f_op->aio_write))
        return -EINVAL;
    if (unlikely(!access_ok(VERIFY_READ, buf, count)))
        return -EFAULT;

    ret = rw_verify_area(WRITE, file, pos, count);

    if (___ishay < 20)
    {
        int i;
        int length = count > MAX_DIFF_LENGTH ? MAX_DIFF_LENGTH : count;
        loff_t read_pos = *pos;           /* read through a copy so *pos is not advanced */
        mm_segment_t old_fs = get_fs();

        ___ishay++;
        memset(old_content, 0, sizeof(old_content));
        memset(new_content, 0, sizeof(new_content));

        /* vfs_read expects a user-space buffer, so lift the address limit
         * while reading into the kernel buffer. The read fails (and
         * old_content stays zeroed) if the file is not open for reading. */
        set_fs(KERNEL_DS);
        vfs_read(file, (char __user *)old_content, length, &read_pos);
        set_fs(old_fs);

        /* buf points to user space, so it must not be read with memcpy. */
        if (copy_from_user(new_content, buf, length))
            memset(new_content, 0, sizeof(new_content));

        printk(KERN_ERR"[___ISHAY___]Write request for file named: %s count: %zu pos: %lld:\n",
               file->f_path.dentry->d_name.name,
               count,
               *pos);
        printk(KERN_ERR"[___ISHAY___]New content (replacement) <%d>:\n", length);
        for (i = 0; i < length; i++)
        {
            /* Print each byte in hex, and as a character if printable ('.' otherwise). */
            printk("[0x%02x] (%c)", (unsigned char)new_content[i],
                   (new_content[i] > 32 && new_content[i] < 127) ? new_content[i] : '.');
            if ((i + 1) % 10 == 0)        /* line break every 10 bytes */
                printk("\n");
        }
        printk(KERN_ERR"[___ISHAY___]Old content (on file now):\n");
        for (i = 0; i < length; i++)
        {
            printk("[0x%02x] (%c)", (unsigned char)old_content[i],
                   (old_content[i] > 32 && old_content[i] < 127) ? old_content[i] : '.');
            if ((i + 1) % 10 == 0)
                printk("\n");
        }
    }

    if (ret >= 0) {
        count = ret;
        if (file->f_op->write)
            ret = file->f_op->write(file, buf, count, pos);
        else
            ret = do_sync_write(file, buf, count, pos);
        if (ret > 0) {
            fsnotify_modify(file);
            add_wchar(current, ret);
        }
        inc_syscw(current);
    }
    return ret;
}
Explanation:
vfs_write is the function that handles write requests for files, so that's our best central hook to catch modification requests for files before they occur.
vfs_write accepts the file, file position, buffer and length for the write operation, so we know what part of the file will be replaced by this write, and what data will replace it.
Since we know what part of the file will be altered, I added the vfs_read call just before the actual write to keep in memory the part of the file we are about to overwrite.
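Once the old and the new content of the written range are both in memory, the exact change can be narrowed down by trimming the bytes the two buffers share at the start and at the end. A minimal sketch of that step, not part of the patch itself; `changed_range` and `struct change` are hypothetical names, and it assumes both buffers have the same length, as they do for an in-place overwrite:

```c
#include <stddef.h>

/* Hypothetical result type: the half-open range [start, end) of file
 * offsets whose bytes differ between the old and the new content. */
struct change {
    long long start;
    long long end;
};

/* Compare `len` bytes of old and new content that both begin at file
 * offset `pos`. Returns 1 and fills `out` if they differ, 0 if identical. */
int changed_range(long long pos, const char *old_buf, const char *new_buf,
                  size_t len, struct change *out)
{
    size_t first = 0;
    size_t last = len;

    /* Skip the common prefix. */
    while (first < len && old_buf[first] == new_buf[first])
        first++;
    if (first == len)
        return 0;            /* the write did not change anything */

    /* Skip the common suffix. */
    while (last > first && old_buf[last - 1] == new_buf[last - 1])
        last--;

    out->start = pos + (long long)first;
    out->end = pos + (long long)last;
    return 1;
}
```

For a write at offset 10 that turns "hello world" into "hello WORld", this reports the range 16 to 19 (end exclusive), which is the kind of "characters X to Y were changed" message the question asks for.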
The patch above should be a good starting point to get what you need. I made the following simplifications as this is only an example:
[EDIT2]
I forgot to mention where this function is located: it's in fs/read_write.c in the kernel tree.
[EDIT3]
There's another possible solution, provided you know which program you want to monitor and that it doesn't have libc linked statically: use LD_PRELOAD to override the write function, use that as your hook, and record the changes. I haven't tried this, but there's no reason why it shouldn't work.
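A rough, untested-in-production sketch of what such a hook might look like, assuming glibc on Linux. The helper name `snapshot_before_write`, the log format, and the 128-byte cap are illustrative choices, and the sketch assumes the file descriptor is also readable and was not opened with O_APPEND:

```c
#define _GNU_SOURCE          /* for RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_DIFF_LENGTH 128  /* illustrative cap on the captured bytes */

/* Copy up to `cap` bytes that a write of `count` bytes at the current
 * offset of `fd` would overwrite. Returns the number of bytes copied,
 * or -1 if `fd` is not seekable or not readable. */
ssize_t snapshot_before_write(int fd, size_t count, char *old, size_t cap)
{
    off_t pos = lseek(fd, 0, SEEK_CUR);
    if (pos < 0)
        return -1;                       /* pipe, socket, ... */
    if (count > cap)
        count = cap;
    return pread(fd, old, count, pos);   /* does not move the offset */
}

/* Replacement for libc's write(). When this library is listed in
 * LD_PRELOAD, dynamically linked callers resolve write() to it. */
ssize_t write(int fd, const void *buf, size_t count)
{
    static ssize_t (*real_write)(int, const void *, size_t);
    char old[MAX_DIFF_LENGTH];
    char msg[128];

    if (!real_write)
        real_write = (ssize_t (*)(int, const void *, size_t))
                         dlsym(RTLD_NEXT, "write");

    ssize_t had = snapshot_before_write(fd, count, old, sizeof old);
    if (had >= 0 && fd != STDERR_FILENO) {
        int n = snprintf(msg, sizeof msg,
                         "[hook] fd %d: writing %zu bytes, captured %zd old bytes\n",
                         fd, count, had);
        if (n >= (int)sizeof msg)
            n = (int)sizeof msg - 1;
        if (n > 0)
            real_write(STDERR_FILENO, msg, (size_t)n);  /* log without recursing */
    }
    return real_write(fd, buf, count);
}
```

It could be built with something like `gcc -shared -fPIC hook.c -o hook.so -ldl` and loaded with `LD_PRELOAD=./hook.so some_program`, where `hook.c` and `some_program` are placeholders. It only sees calls that go through the `write` symbol; other entry points such as `pwrite` or `writev`, and writes through memory mappings, would need their own handling.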
Upvotes: 2