Reputation: 2835
I'm a Linux SysAdmin and always work in Bash. I am very experienced at administering PHP, but fairly new to writing PHP scripts.
What reasons would I want to get output from a system command rather than from PHP internal functions? I assume there are portability concerns, although that is not a problem in my case.
For example, I could get a directory listing with ls and an exec-style function, or do something with opendir('.') and a loop.
What are the performance considerations?
Here is what I have now,
<?php
// Open the directory
$myDirectory = opendir('/media/ARCHIVE/Documents/conf/web_confs');

// Build an array of entries
$dirArray = array();
while (($entryName = readdir($myDirectory)) !== false) {
    $dirArray[] = $entryName;
}

// Close the directory
closedir($myDirectory);

// Sort the array
sort($dirArray);
?>
Is there a more efficient way to replace that loop? Performance does matter in my case; these will be made into snippet-like things.
This is system-related code that will not be taking any kind of input, not even from the URL. In addition, the environment is large scale, and all the systems are performance-tuned and often profiled.
So doing it the best way possible, once, is what I'm looking for. I would really appreciate any input, as my experience is somewhat limited.
Upvotes: 2
Views: 1002
Reputation: 21
TechZilla - it sounds like you and YCS have a misunderstanding. I'd ignore it.
I have 2 hopefully useful inputs:
1. My most reliable hunch is to stick with the PHP calls rather than forking off another process:
Forking a new process does all sorts of fun things: file handle and state copies, backup of the entire stack, backup of register state, relocation of memory locations, loading the new executable from disk, loading each of its shared libraries that aren't already loaded, and jumping the instruction and data registers literally thousands of times (which means lots of likely CPU cache misses!) for memory-location patching. In short, a huge headache.
A function call (even to a C binding) is just a bit of extra space on the stack and a few instruction-pointer jumps. The very first time your app loads, it will deal with loading up any dependent libraries (the headache defined above), but those stay in memory, so it's not nearly as big a deal. You can't do the same thing with a whole program.
2. Ajreal's memcache suggestion sounds like a good third option.
You can always use inotify to let a script know when an important directory has been modified, and touch (not necessarily with the touch tool) a db (even a flat file or memcache) to record that it has changed.
All the code that checks directory contents can then use the cache unless it has been notified otherwise. The Linux file system cache will make checks on your mutex file very efficient: memory-only, with no hit to disk unless it has been changed. A rough sketch of this follows below.
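For illustration only (this sketch is mine, not part of the answer): a long-running watcher like the following could mark a flag file whenever the directory changes. It assumes the PECL inotify extension is available, and the flag-file path is made up.

<?php
// Watcher script: mark a flag file whenever the watched directory changes.
// Assumes the PECL inotify extension; /var/run/web_confs.changed is a
// made-up flag-file location.
$watchDir = '/media/ARCHIVE/Documents/conf/web_confs';
$flagFile = '/var/run/web_confs.changed';

$fd = inotify_init();
inotify_add_watch($fd, $watchDir,
    IN_CREATE | IN_DELETE | IN_MODIFY | IN_MOVED_FROM | IN_MOVED_TO);

while (true) {
    inotify_read($fd);   // blocks until something in the directory changes
    touch($flagFile);    // "touch" in the loose sense described above
}
?>

The code that actually reads the listing would then compare filemtime() of the flag file against the timestamp stored alongside its cached copy (in memcache, a flat file, or similar) and re-scan the directory only when the flag is newer.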
Upvotes: 0
Reputation: 157989
What reasons would I want to get output from a system command, rather than PHP internal functions
You named them yourself - you are familiar with them.
That's enough.
As for performance, it doesn't matter.
Look, you are worrying about the smallest part of the system.
To process a user request, a web server will do hundreds of such disk lookups. Why aren't those your concern? Yet you want to super-duper-extra-micro-optimize only this one.
And you don't even know whether it's required or not.
There is no logic in this.
Upvotes: -1
Reputation: 47331
Executing a bash command will generally be slower.
There are two situations in which you might need to call an external bash command:
1. When PHP doesn't have built-in library support (or it would require installing a PECL package), such as rsync, wget, siege, mysqldump, ab (Apache Bench). This is when you will find executing an external command is easier.
2. When a bash one-liner can get a better result than the PHP built-in functions, as in your example. Or another example: du -h $dir | sort -nr (a sketch of running that from PHP follows below).
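Purely as an illustration (not part of the answer): running that pipeline from PHP could look like the following, with escapeshellarg() keeping the directory name from breaking the command.

<?php
// Run the du | sort pipeline from PHP and capture its output as a string.
$dir    = '/media/ARCHIVE/Documents/conf/web_confs';
$output = shell_exec('du -h ' . escapeshellarg($dir) . ' | sort -nr');
echo $output;
?>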
As the other answers already indicate, if performance is a concern, do benchmark and compare for yourself; a minimal comparison is sketched below.
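This is an unscientific sketch of mine, not part of the answer; the path and iteration count are placeholders, and the only point is to show the shape of such a benchmark.

<?php
// Crude timing of the built-in call versus forking ls; adjust to taste.
$dir  = '/media/ARCHIVE/Documents/conf/web_confs';
$runs = 1000;

$start = microtime(true);
for ($i = 0; $i < $runs; $i++) {
    $list = scandir($dir);                       // built-in, no fork
}
printf("scandir: %.4f s\n", microtime(true) - $start);

$start = microtime(true);
for ($i = 0; $i < $runs; $i++) {
    $list = array();                             // exec() appends, so reset
    exec('ls ' . escapeshellarg($dir), $list);   // forks a shell + ls each time
}
printf("exec ls: %.4f s\n", microtime(true) - $start);
?>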
If security is a concern, executing an external bash command carries a higher risk.
If the directory scan really is a bottleneck,
you should consider caching the result in storage that allows fast access, such as APC or memcache.
You can easily create a hook / trigger to expire the cache whenever there is write activity on the directories.
An example, using APC (the cache key name below is arbitrary):
$key = 'web_confs_listing';
$dirArray = apc_fetch($key, $hit);
if ($hit)
{
    // results came straight from the cache
}
else
{
    // scan dir
    $dirArray = scandir('/media/ARCHIVE/Documents/conf/web_confs');
    // set the scan result into cache
    apc_store($key, $dirArray);
}
// $dirArray now holds the scan result
With this typical hit-or-miss technique, your performance can get a pretty good boost.
Upvotes: 1
Reputation: 1553
I wouldn't really say PHP functions are slower, as they are mostly mapped to underlying C library functions. Custom executables are a different story, of course.
Using exec means forking the process, copying the whole process memory (skipping vfork here, I know), and replacing the process memory with the new executable. The latter is actually quite fast. I would say a normal C2D system could handle a few thousand execs a second.
Upvotes: 1
Reputation: 44344
For your example of reading a directory, doing it in PHP is certainly going to be faster than having the OS load up a new process.
As for security: PHP's built-in functions will have fewer security concerns than launching a process through a shell (such as with PHP's system function). Starting a shell is especially risky business if you use the user's input in the command to the shell: you must make sure that no matter what the user might input, you know what the shell command you construct will do! (This is similar to SQL injection, if you're familiar with that.) PHP provides functions for this (escapeshellarg and escapeshellcmd), and the manual pages for system and exec note them.
That said, reading a directory using the user's input also has security considerations (in the general case). Always make sure that you sanitize the user's input as appropriate for your case. If they can read any directory, opendir($the_users_input) is probably OK. If they should only be able to read things under /foo/bar, then you likely need to make sure they're not passing you ../../etc/passwd; a rough sketch of one such check follows below.
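As a rough sketch of that last point (mine, not the answer's): one common way to confine input to a base directory is to resolve it with realpath() and check the prefix. The $base path and the $_GET parameter name are placeholders.

<?php
// Confine a user-supplied path to the /foo/bar tree (placeholder base).
$base      = '/foo/bar';
$userInput = $_GET['dir'];                        // hypothetical input source
$resolved  = realpath($base . '/' . $userInput);  // resolves ../ and symlinks

if ($resolved === false ||
    ($resolved !== $base && strpos($resolved, $base . '/') !== 0)) {
    die('Invalid directory');                     // outside the allowed tree
}
$handle = opendir($resolved);
?>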
Upvotes: 2
Reputation: 38740
For this specific case there is scandir (http://www.php.net/manual/en/function.scandir.php), which I would use.
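For instance (my sketch, not part of the answer), the loop from the question could be replaced with something like this; scandir() returns the entries already sorted, and array_diff() strips the . and .. entries if you don't want them.

<?php
// scandir() returns a sorted array of entries, including "." and "..".
$dirArray = scandir('/media/ARCHIVE/Documents/conf/web_confs');
$dirArray = array_diff($dirArray, array('.', '..'));
?>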
It's probably faster to use PHP's built-in functions to do these things as I guess doing an exec() has some overhead. Honestly I don't think you'll have a performance problem unless you are getting directory listings thousands of times in a single script. Also, benchmark it if you really care, but I think you'll be doing micro-optimizations that don't make a significant improvement. (Premature optimization is the root of all evil!)
You'll have security issues if you take user input and don't sanitize it. I would be extremely careful about sanitizing input that is then passed to exec().
Upvotes: 4