gwcoffey

Reputation: 5921

Creating thread-safe non-deleting unique filenames in ruby/rails

I'm building a bulk-file-uploader. Multiple files are uploaded in individual requests, and my UI provides progress and success/fail. Then, once all files are complete, a final request processes/finalizes them. For this to work, I need to create many temporary files that live longer than a single request. Of course I also need to guarantee filenames are unique across app instances.

Normally I would use Tempfile for easy unique filenames, but in this case it won't work because the files need to stick around until another request comes in to further process them. Tempfile auto-unlinks files when they're closed and garbage collected.

An earlier question here suggests using Dir::Tmpname.make_tmpname but this seems to be undocumented and I don't see how it is thread/multiprocess safe. Is it guaranteed to be so?

In C I would open the file with O_EXCL, which fails if the file already exists. I could then keep trying until I successfully get a handle on a file with a truly unique name. But Ruby's File.open doesn't seem to have an "exclusive" option of any kind. If the file I'm opening already exists, I have to either append to it, open it for writing at the end, or truncate it.

Is there a "right" way to do this in Ruby?

I have worked out a method that I think is safe, but it seems overly complex:

require 'tempfile'

# make a unique filename
time = Time.now
filename = "#{time.to_i}-#{sprintf('%06d', time.usec)}"

# make a tempfile (this is guaranteed to find a unique creatable name)
data_file = Tempfile.new(["upload", ".data"], UPLOAD_BASE)

# but the file will be deleted automatically, which we don't want, so now link it in a stable location
count = 1
loop do
   begin
      # File.link will raise an exception if the destination path exists
      File.link(data_file.path, File.join(UPLOAD_BASE, "#{filename}-#{count}.data"))
      # so here we know we created a file successfully and nobody else will take it
      break
   rescue Errno::EEXIST
      count += 1
   end
end

# now unlink the original tempfile (it's still writable until it's closed)
data_file.unlink

# ... write to data_file and close it ...

NOTE: This won't work on Windows. Not a problem for me, but reader beware.

In my testing this works reliably. But again, is there a more straightforward way?

Upvotes: 3

Views: 407

Answers (2)

gwcoffey

Reputation: 5921

I actually found the answer after some digging. Of course the obvious approach is to see what Tempfile itself does. I just assumed it was native code, but it is not. The source for 1.8.7 can be found here for instance.

As you can see, Tempfile uses an apparently undocumented file mode of File::EXCL. So my code can be simplified substantially:

# make a unique filename
time = Time.now
filename = "#{time.to_i}-#{sprintf('%06d', time.usec)}"

data_file = nil
count = 1
loop do
   begin
      data_file = File.open(File.join(UPLOAD_BASE, "#{filename}-#{count}.data"), File::RDWR|File::CREAT|File::EXCL)
      break
   rescue Errno::EEXIST
      count += 1
   end
end

# ... write to data_file and close it ...

UPDATE: I now see that this is covered in a prior thread:

How do open a file for writing only if it doesn't already exist in ruby

So maybe this whole question should be marked a duplicate.
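
The exclusive-create behavior this answer relies on can be checked in isolation. The following is a minimal sketch, not code from the original post: `create_exclusively` is a hypothetical helper name, and a scratch directory from `Dir.mktmpdir` stands in for UPLOAD_BASE. It only shows that a second open of the same path with File::CREAT|File::EXCL raises Errno::EEXIST instead of touching the existing file.

```ruby
require 'tmpdir'

# Hypothetical helper (not from the original post): returns true if this call
# created the file, false if the path already existed.
def create_exclusively(path)
  # CREAT|EXCL asks the OS to fail with EEXIST rather than reuse an existing file.
  File.open(path, File::WRONLY | File::CREAT | File::EXCL) { |f| f.write("data") }
  true
rescue Errno::EEXIST
  false
end

# Dir.mktmpdir is used here only as a throwaway stand-in for UPLOAD_BASE.
Dir.mktmpdir do |dir|
  path = File.join(dir, "upload-1.data")
  first  = create_exclusively(path)
  second = create_exclusively(path)
  puts "first: #{first}, second: #{second}"
end
```

Because the existence check and the creation happen in a single system call, two processes racing for the same name cannot both succeed, which is what makes the retry loop above safe across app instances on a local filesystem.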

Upvotes: 1

Brad Werth

Reputation: 17647

I would use SecureRandom.

Maybe something like:

require 'securerandom'

p SecureRandom.uuid #=> "2d931510-d99f-494a-8c67-87feb05e1594"

or

p SecureRandom.hex #=> "eb693ec8252cd630102fd0d0fb7c3485"

With SecureRandom.hex you can specify the length (as a number of random bytes; the hex string is twice as long), and you can count on a vanishingly small chance of collision.
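
If "almost impossible" is not strong enough, random names can be combined with an exclusive open so that a collision is detected rather than silently overwriting a file. This is only a sketch under assumptions: `create_random_named_file`, its `ext` and `max_attempts` parameters, and the use of `Dir.mktmpdir` as the upload directory are all made up for illustration.

```ruby
require 'securerandom'
require 'tmpdir'

# Hypothetical helper (illustration only): creates a new file with a random
# hex name inside dir and returns the open File. The caller must close it.
def create_random_named_file(dir, ext = ".data", max_attempts = 10)
  max_attempts.times do
    path = File.join(dir, "#{SecureRandom.hex(16)}#{ext}")
    begin
      # EXCL makes the open fail instead of reusing a file that already exists.
      return File.open(path, File::RDWR | File::CREAT | File::EXCL)
    rescue Errno::EEXIST
      # Extremely unlikely collision: fall through and try another random name.
    end
  end
  raise "could not create a unique file in #{dir}"
end

# Dir.mktmpdir stands in for the real upload directory.
Dir.mktmpdir do |dir|
  file = create_random_named_file(dir)
  file.write("uploaded bytes")
  file.close
  puts File.basename(file.path)
end
```

The returned file is an ordinary File, so nothing deletes it automatically; a later request can find it by the name stored alongside the upload.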

Upvotes: 3
