Alex Reynolds

Reputation: 96976

How to improve performance of File::Find::Rule calls?

I am using File::Find::Rule to locate one-level-deep user-executable folders in a directory specified in $dir:

my @subDirs = File::Find::Rule->permissions(isExecutable => 1, user => "$uid")->
                                extras({ follow => 1, follow_skip => 2 })->
                                directory->
                                maxdepth(1)->
                                in( $dir );
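
For reference, the permissions() rule is not part of core File::Find::Rule; it presumably comes from the File::Find::Rule::Permissions extension. A self-contained sketch of the call, with placeholder values for $dir and $uid, looks roughly like this:

use strict;
use warnings;
use File::Find::Rule;
use File::Find::Rule::Permissions;   # presumed source of the permissions() rule

my $dir = '/some/dir';   # placeholder: the directory to search
my $uid = $<;            # real uid of the user running the script

my @subDirs = File::Find::Rule->permissions( isExecutable => 1, user => $uid )
                              ->extras( { follow => 1, follow_skip => 2 } )
                              ->directory
                              ->maxdepth(1)
                              ->in($dir);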

Here is the rough equivalent, using the UNIX find utility:

my $subDirStr = `find $dir -maxdepth 1 -type d -user $username -perm -100`;
chomp($subDirStr); 
my @subDirs = split("\n", $subDirStr);

Both are run from scripts that have the permissions needed to read this data.

If I run a find statement on the command-line, the results come back instantaneously.

If I run either of the above statements from a Perl script, the results take several seconds to come back.

What can I do programmatically to improve the performance of either of the two Perl approaches?

Upvotes: 9

Views: 1372

Answers (3)

ysth

Reputation: 98398

I'm going to ignore the File::Find::Rule part for the moment and focus on the difference between running find from the command line and running find via backticks in Perl.

First, please verify that a script that does nothing but run the find command above still has the problem, run by you as the same user, from the same working directory, and against the same directories as the quickly-running command-line invocation.

If it doesn't have the problem, we need to know more about your script. Or you need to remove things from your script piece by piece until you have it down to just doing the find command, and see what needed to be removed to make the problem go away.

If it does, try using a full path (e.g. /usr/bin/find) instead of just find, to rule out PATH differences or shell aliases as the cause.

Also check that the output of the command-line run and that of the backticks run are identical.

And try redirecting the output of both to /dev/null (inside the backticks, for the Perl version) and see if that makes any difference to the timing.
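
As a concrete starting point, a minimal timing sketch for that stripped-down test (assuming find lives at /usr/bin/find; the directory and user name below are placeholders) might look like:

#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

my $dir      = '/some/dir';    # placeholder: your real directory
my $username = 'someuser';     # placeholder: your real user name

my $t0  = [gettimeofday];
my $out = `/usr/bin/find $dir -maxdepth 1 -type d -user $username -perm -100`;
my $elapsed = tv_interval($t0);

my @lines = split /\n/, $out;
printf "find returned %d lines in %.3f seconds\n", scalar @lines, $elapsed;

# For the /dev/null variation, append '> /dev/null 2>&1' inside the
# backticks and compare the timing.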

Upvotes: 3

mishigas

Reputation: 11

You must realize that calling commands from Perl via backticks or system() causes Perl to fork off a shell, which then runs the desired command. This will always be slower, though on fast systems with idle resources it may not be very noticeable.
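
If the extra shell turns out to matter, one way to sidestep it (a sketch with placeholder values, not the poster's code) is the list form of open, which runs find directly rather than through /bin/sh:

use strict;
use warnings;

my $dir      = '/some/dir';    # placeholder
my $username = 'someuser';     # placeholder

# The list form of open runs find directly, with no intermediate shell,
# and also avoids shell-quoting problems with unusual directory names.
open my $pipe, '-|', 'find', $dir,
    '-maxdepth', '1', '-type', 'd',
    '-user', $username, '-perm', '-100'
    or die "Can't run find: $!";

my @subDirs;
while ( my $path = <$pipe> ) {
    chomp $path;
    push @subDirs, $path;
}
close $pipe;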

Upvotes: 0

Grant McLean

Reputation: 7008

I suspect that the delay you are seeing is due to the length of time it takes to produce all the results. Sure, if you pipe your find command into less, you get results immediately, but if you pipe it into tail you might see a delay similar to what you see with your Perl script.

In both your alternative implementations, you are creating an array with a list of all matching files - your code will not continue on until the file matching process is complete.

You could alternatively use an iterator approach like this:

my $rule = File::Find::Rule->permissions(isExecutable => 1, user => $uid)
                           ->extras({ follow => 1, follow_skip => 2 })
                           ->directory
                           ->maxdepth(1)
                           ->start($dir);
while( defined ( my $path = $rule->match ) ) {
    ...
}

For completeness, you could achieve a similar result with the find command. Instead of using backticks, you could explicitly use a pipe and read results one at a time:

open my $pipe, "find $dir -maxdepth 1 -type d -user $username -perm -100 |" or die "Can't run find: $!";
while(my $path = <$pipe>) {
    ...
}

Note that with both these examples, your code can start processing results as soon as the first match is found. However, the total time taken until the last result is processed shouldn't be much different to your original code.

Upvotes: 5
