Brandon Ling
Brandon Ling

Reputation: 3919

Folder recursion: when to return?

What I am trying to do is recursively iterate through all the folders and gathering an $arr from each file that I will later merge. In high level terms this is what I am doing:

function main(){
    $path = ....
    $arr = array();
    if(is_dir($path)){
       $arr = parseFolder($path, $arr);
    } else {
       $arr = parseFile($path);
    }
    print $arr;
}
function parseFile($path){
   ....
   return $arr
}

function parseFolder($path, $arr){
   $folder = opendir($path);
   while($item=readdir($folder)){
      if(is_dir($item)
          parseFolder($path . '/' . $item, $arr);
      else
          $arr = merge_array($arr, parseFile($path . '/' . $item);

   }
   return $arr
}

As you can see there is going to be a problem with this because I don't have a base case. So, it will eventually return $arr multiple times in the parseFolder function. Is there anyway to know when I'm virtually done iterating through all files/folders, so I know when to return my final result $arr? I'm open to more efficient implementations.

Upvotes: 0

Views: 67

Answers (2)

Sylwester
Sylwester

Reputation: 48745

There are some syntax errors in you code. Missing parenthesis several places, you used constant folder instead of variable $folder and I expect merge_array actually is array_merge?

Your solution seems to be a mix of two approaches:

Mutating version where the supplied reference is added to.

function parseFolder($path, &$arr)
{
    $folder = opendir($path);
    while($item=readdir($folder)) {
        // current and parent directory shouldn't be processed
        if( $item == '.' || $item == '..' )
            continue;
        elseif( is_dir($item) )
            parseFolder($path . '/' . $item, $arr);
        else
            $arr = array_merge($arr, parseFile($path . '/' . $item));
    }
    // the supplied array is updated with the result
}

A version that uses the return value of deeper recursion and build by value.

function parseFolder($path)
{
    $arr = array();
    $folder = opendir($path);
    while($item=readdir($folder)) {
        // current and parent directory shouldn't be processed
        if( $item == '.' || $item == '..' ) 
            continue;
        elseif( is_dir($item) )
            $arr = array_merge($arr, parseFolder($path . '/' . $item));
        else
            $arr = array_merge($arr, parseFile($path . '/' . $item));
    }
    return $arr;
}

Upvotes: 0

Sammitch
Sammitch

Reputation: 32232

You need to replace:

parseFolder($path . '/' . $item, $arr);

With

$arr = array_merge($arr, parseFolder($path . '/' . $item, $arr));

Otherwise all of your recursive calls will be thrown away.

Also, I don't think you have a full grasp on recursion as it seems that you think if you return at any point down deep in the recursion it will break all the way back out to the original call. This is incorrect.

The only caveats with recursive functions are:

  1. Make sure you don't infinitely recurse, eg. in this case a symlink to a parent directory.
  2. A sufficiently deep recursion can cause a stack overflow and crash your program.

For #1 you can simply avoid processing symlinks, or to cover both you can implement a depth limit. eg:

<?php
define('RECURSE_MAXDEPTH', 10);

function myRecurse($path, $depth=0) {
  $arr = array();
  $folder = opendir($path);
  while( $item = readdir($folder) ) {
    if( is_dir($item) ) {
      if( $depth < RECURSE_MAXDEPTH ) {
        $arr = array_merge($arr, myRecurse($path.'/'.$item, $depth+1));
      }
    } else {
      $arr = array_merge($arr, someFunction($item));
    }
  }
  return $arr;
}

$myArr = myRecurse('~sammitch/');

Upvotes: 1

Related Questions