Cliffordlife

Reputation: 470

Duplicate filename protection by incrementation

The issue was saving file uploads locally, and trying to find a nice way to handle duplicate file names.

Upvotes: 2

Views: 3616

Answers (3)

Alain Duchesneau

Reputation: 394

I've made my own solution. Here it is:

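function recursive_increment_filename($path, $filename)
{
    // Return the candidate path as soon as nothing exists under that name.
    // file_exists() also catches a directory with the same name, which is_file() would miss.
    $test = "{$path}/{$filename}";
    if (!file_exists($test)) return $test;

    $file_info = pathinfo($filename);
    $part_filename = $file_info['filename'];

    // If the name already ends in "_<number>", bump the number; otherwise start at 1.
    if (preg_match('/(.*)_(\d+)$/', $part_filename, $matches))
    {
        $num = (int) $matches[2] + 1;
        $part_filename = $matches[1];
    }
    else
    {
        $num = 1;
    }
    $filename = $part_filename . '_' . $num;

    // Re-attach the extension, if the original name had one.
    if (array_key_exists('extension', $file_info))
    {
        $filename .= '.' . $file_info['extension'];
    }

    // Try again with the incremented name.
    return recursive_increment_filename($path, $filename);
}

$url = realpath(dirname(__FILE__));

$file = 'test.html';
$fn = recursive_increment_filename($url, $file);

echo $fn;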

Upvotes: 0

Mandar

Reputation: 11

The simplest solution is to attach a timestamp in the form YYYYMMDDHHMMSS. Conflicts can then only happen if two files with the same name are uploaded within the same second ;) It is also cheap: you can skip the name check entirely. If you are uploading a file named "1.jpg", just save it as 1(timestamp).jpg, so that you never need to iterate through the file system. Hope it helps.

e.g. in PHP

$timestamp = date("YmdHis"); // "H" is the zero-padded 24-hour hour; "G" would drop the leading zero

will generate something like
20111122193631
;)
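
As a rough sketch of the "1(timestamp).jpg" idea, the timestamp can be slotted in between the base name and the extension. The helper name timestamped_filename and the underscore separator are assumptions made for illustration, not part of the answer above, and the optional $time parameter exists only to make the result reproducible.

```php
<?php
// Hypothetical helper: inserts a YmdHis timestamp between base name and extension.
// Note: two uploads of the same name within one second would still collide.
function timestamped_filename(string $filename, ?int $time = null): string
{
    $info  = pathinfo($filename);
    $stamp = date('YmdHis', $time ?? time());

    $name = $info['filename'] . '_' . $stamp;

    // pathinfo() only sets 'extension' when the name contains one.
    if (isset($info['extension'])) {
        $name .= '.' . $info['extension'];
    }

    return $name;
}

echo timestamped_filename('1.jpg'), "\n";
```

No filesystem lookup is involved, which is the point of the approach.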

Upvotes: 1

Fred Foo

Reputation: 363607

This algorithm is not scalable. The k-th upload of a file with the same name has to probe k existing names before it finds a free one, so uploading n such files costs O(n) per upload and O(n²) in total running time, including O(n²) filesystem accesses. That's not pretty for a server app. It also can't be fixed because of how filesystems work.

Better solutions:

  1. Store filenames that have already been used in a DB table, mapping them to their use count.
  2. Put a high-granularity timestamp in the filename.
  3. Use the SHA1 (or MD5) hash of the contents as the filename. This also prevents duplicate files being uploaded, if that's important.

Use a database to map filenames back to human-readable names, if necessary.
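
A minimal sketch of option 3, assuming the upload has already been written to a temporary path. The helper name hashed_filename is hypothetical, and keeping the lowercased extension of the client-supplied name is a design assumption; the database mapping back to the human-readable name is left out here.

```php
<?php
// Hypothetical helper: names a stored upload after the SHA-1 hash of its contents.
// Identical contents always yield the same name, so duplicates collapse into one file.
function hashed_filename(string $tmpPath, string $originalName): string
{
    $hash = sha1_file($tmpPath);
    if ($hash === false) {
        throw new RuntimeException("Cannot read {$tmpPath}");
    }

    // With PATHINFO_EXTENSION, pathinfo() returns '' when there is no extension.
    $ext = pathinfo($originalName, PATHINFO_EXTENSION);

    return $ext === '' ? $hash : $hash . '.' . strtolower($ext);
}

// Usage with a throwaway temp file standing in for an uploaded file.
$tmp = tempnam(sys_get_temp_dir(), 'up');
file_put_contents($tmp, 'hello');
echo hashed_filename($tmp, 'Holiday Photo.JPG'), "\n";
unlink($tmp);
```

The name is computed from the contents alone, so no probing for existing names is needed before saving.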

Upvotes: 3
