Reputation: 4486
I need to detect when either of two file types is accessed in any way across an entire Windows file system.
As I understand it, the only way to do this without causing serious slowdowns for the operating system is to create a file system filter driver. Is that right?
Essentially, all I need to do is take a copy of any doc(x) files and PDFs that are opened. I decided on this approach because the alternative was to use file monitors in C#, which wouldn't be effective for an entire drive.
My question is twofold: is there an easier way, and how would I go about simply taking a copy of each doc(x)/PDF file as it's accessed?
The solution needs to be deployable with the package we're currently producing.
UPDATE
I'm going to benchmark the file system watcher. After discussing it with people here, I think it may be acceptable. My concern is that I need to monitor the common user directories where downloads will occur (so "C:\Users\SomeUser*") as well as the Outlook temporary folder.
Upvotes: 4
Views: 2251
Reputation: 59303
From what I read in the comments, a File System Watcher would probably work well. I am not exactly sure whether Search Everything uses one, but if it does, I cannot notice any impact.
Another option might be ETW (Event Tracing for Windows), as used by Process Monitor. Even with millions of changes, I can hardly notice the impact there either.
If you want to go for Volume Shadow Copies as proposed by Hans Passant, Alpha Volume Shadow Copies might be a suitable library offering support for it.
Conclusion: a filter driver is probably not needed, and avoiding one keeps you away from other problems. I admit, though, that the description of hierarchical storage management systems might match your approach, if you think of the upload store as the next level in the hierarchy after the hard disk.
Upvotes: 1
Reputation: 14173
I think that creating a copy on read will cause a lot of problems. For instance: virus scanners. Consider the following:
Now of course you could create copies with a different extension to prevent this, but there will still be a lot of READ actions on files. I sometimes open a file like 10 times, just because I closed it accidentally or I want to recheck something I just read. Now you'll have 10 copies?
I would definitely go with Hans Passant's suggestion of creating a copy on change/create. That happens a lot less by definition, because you always need to open a file to alter it, but you don't have to alter it when you open it.
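One way to keep repeated events from piling up duplicate copies is to name each copy after a hash of its content, so identical content is stored only once. The following is a minimal sketch under that assumption; `DedupCopier`, `ContentHash`, `CopyIfNew` and the `storeDir` parameter are hypothetical names introduced for illustration, not part of any suggestion in this thread.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class DedupCopier
{
    // Returns the SHA-256 of the file's contents as an uppercase hex string.
    public static string ContentHash(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream)).Replace("-", "");
        }
    }

    // Copies source into storeDir under a name derived from its content hash.
    // Returns true if a new copy was written, false if the same content
    // was already stored by an earlier call.
    public static bool CopyIfNew(string source, string storeDir)
    {
        Directory.CreateDirectory(storeDir);
        string name = ContentHash(source) + Path.GetExtension(source);
        string target = Path.Combine(storeDir, name);
        if (File.Exists(target))
        {
            return false;
        }
        File.Copy(source, target);
        return true;
    }
}
```

With this, opening or saving the same unchanged document ten times still leaves a single stored copy. The original file name is lost in the stored name, so a real deployment would probably want to record the source path somewhere alongside it.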
The second problem would be to detect a read of a file. With docx you could check for the creation of hidden files like '~$_____.docx', but that doesn't work for PDF. Also, as you mentioned, you will have to check an entire disk. There is no way around it: if a file can be in any folder, you'll have to check all the folders. Creating an internal list of docx and PDF files in a service could be faster, but since you'll have to loop through each file again at set intervals, it depends on how many files are on the system.
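The "internal list" idea could look roughly like the sketch below: scan once, remember each file's last-write time, and compare against the next scan. It only detects new or modified files, not reads. `DocumentIndex`, `Scan` and `Diff` are hypothetical names, and the sketch assumes every directory under the root is readable.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class DocumentIndex
{
    static readonly string[] Extensions = { ".doc", ".docx", ".pdf" };

    // Builds a path -> last-write-time map for tracked files under root.
    // Assumption: all subdirectories are accessible. EnumerateFiles throws
    // UnauthorizedAccessException otherwise, so a whole-drive scan would
    // need its own directory walk that skips protected folders.
    public static Dictionary<string, DateTime> Scan(string root)
    {
        var index = new Dictionary<string, DateTime>(StringComparer.OrdinalIgnoreCase);
        foreach (string path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            if (Extensions.Contains(Path.GetExtension(path), StringComparer.OrdinalIgnoreCase))
            {
                index[path] = File.GetLastWriteTimeUtc(path);
            }
        }
        return index;
    }

    // Returns the paths that are new or modified since the previous scan.
    public static List<string> Diff(Dictionary<string, DateTime> previous,
                                    Dictionary<string, DateTime> current)
    {
        var changed = new List<string>();
        foreach (var entry in current)
        {
            DateTime old;
            if (!previous.TryGetValue(entry.Key, out old) || old != entry.Value)
            {
                changed.Add(entry.Key);
            }
        }
        return changed;
    }
}
```

A service would call `Scan` on a timer and pass the previous and current results to `Diff`. The cost of each pass grows with the total number of files under the root, which is the trade-off described above.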
So if you really need to check read access, a file system driver is all you've got. But since it will be called on every file access, causing problems or slowing down the system would be a major concern.
If you still want to, check out this File System Filter Driver Tutorial to learn how to do it. Personally, I wouldn't go there.
Upvotes: 2
Reputation: 644
You will need to create a file system watcher. Here is a code example that will watch for changes to docx files.
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;
    using System.Security.Permissions;

    namespace filewatchtest
    {
        class Program
        {
            static void Main(string[] args)
            {
                Run();
            }

            [PermissionSet(SecurityAction.Demand, Name="FullTrust")]
            public static void Run()
            {
                string[] args = System.Environment.GetCommandLineArgs();

                // if directory not specified then end program
                if (args.Length != 2)
                {
                    Console.WriteLine("Usage: filewatchtest.exe directory");
                    return;
                }

                // create a new FileSystemWatcher and set its properties
                FileSystemWatcher watcher = new FileSystemWatcher();
                watcher.Path = args[1];

                // set the notify filters
                watcher.NotifyFilter = NotifyFilters.LastAccess | NotifyFilters.LastWrite | NotifyFilters.FileName | NotifyFilters.DirectoryName;

                // set the file extension filter
                watcher.Filter = "*.docx";

                // add event handlers
                watcher.Changed += new FileSystemEventHandler(OnChanged);
                watcher.Created += new FileSystemEventHandler(OnChanged);
                watcher.Deleted += new FileSystemEventHandler(OnChanged);
                watcher.Renamed += new RenamedEventHandler(OnRenamed);

                // begin watching
                watcher.EnableRaisingEvents = true;

                // wait for the user to quit the program
                Console.WriteLine("Press q to quit the program");
                while (Console.Read() != 'q');
            }

            static void OnRenamed(object sender, RenamedEventArgs e)
            {
                Console.WriteLine("File: {0} renamed to {1}", e.OldFullPath, e.FullPath);
            }

            static void OnChanged(object sender, FileSystemEventArgs e)
            {
                Console.WriteLine("File:" + e.FullPath + " " + e.ChangeType);
            }
        }
    }
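The example above watches a single directory for docx files only and just logs the events. A rough sketch of how it might be extended to cover doc, docx and pdf files in all subdirectories, and to take a copy on each change, is shown below. `DocumentCopier`, `IsTracked`, `Watch` and the `BackupDir` path are hypothetical names chosen for illustration, and the three-attempt retry is an arbitrary assumption, not a tuned value.

```csharp
using System;
using System.IO;
using System.Threading;

static class DocumentCopier
{
    // Hypothetical destination folder. It must lie outside the watched
    // tree, otherwise each copy would raise new events.
    const string BackupDir = @"C:\DocBackup";

    // True for the file types the question asks about.
    public static bool IsTracked(string path)
    {
        string ext = Path.GetExtension(path);
        return ext.Equals(".doc", StringComparison.OrdinalIgnoreCase)
            || ext.Equals(".docx", StringComparison.OrdinalIgnoreCase)
            || ext.Equals(".pdf", StringComparison.OrdinalIgnoreCase);
    }

    // Starts a recursive watcher on root. The caller must keep the
    // returned object alive for as long as events are wanted.
    public static FileSystemWatcher Watch(string root)
    {
        var watcher = new FileSystemWatcher(root);
        watcher.IncludeSubdirectories = true;
        watcher.NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.FileName;
        // Filter accepts a single pattern, so the default (all files) is
        // kept and the extension is checked in the handler instead.
        watcher.Created += OnChanged;
        watcher.Changed += OnChanged;
        watcher.Renamed += (sender, e) => OnChanged(sender, e);
        watcher.EnableRaisingEvents = true;
        return watcher;
    }

    static void OnChanged(object sender, FileSystemEventArgs e)
    {
        if (!IsTracked(e.FullPath))
        {
            return;
        }
        Directory.CreateDirectory(BackupDir);
        string target = Path.Combine(BackupDir, Path.GetFileName(e.FullPath));
        for (int attempt = 0; attempt < 3; attempt++)
        {
            try
            {
                File.Copy(e.FullPath, target, true);
                return;
            }
            catch (IOException)
            {
                // The writing application may still hold the file open.
                Thread.Sleep(200);
            }
        }
    }
}
```

Calling `DocumentCopier.Watch(@"C:\Users")` from `Main` and then blocking, as the original example does with `Console.Read()`, would be enough to try it. Note that files with the same name in different folders overwrite each other in the backup folder, and that this reacts to create, change and rename events, not to plain reads.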
Upvotes: 2