Andreas

Reputation: 5301

How to pipe to and read from the same tempfile handle without race conditions?

I was debugging a Perl script for the first time in my life and came across this:

$my_temp_file = File::Temp->tmpnam();
system("cmd $blah | cmd2 > $my_temp_file");
open(FIL, "$my_temp_file");
...
unlink $my_temp_file;

This works pretty much like I want, except for the obvious race conditions in lines 1-3. Even with a proper tempfile() there is no way (that I can think of) to ensure that the file written to at line 2 is the same one opened at line 3. One solution might be pipes, but errors from cmd might show up late because of limited pipe buffering, and that would complicate my error handling (I think).

How do I:

  1. Write all output from cmd $blah | cmd2 into a tempfile opened file handle?
  2. Read the output without re-opening the file (risking race condition)?

Upvotes: 3

Views: 467

Answers (2)

brian d foy

Reputation: 132822

You can open a pipe to a command and read its contents directly with no intermediate file:

open my $fh, '-|', 'cmd', $blah or die "Could not run cmd: $!";

while( <$fh> ) {
    ...
    }
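
On the worry about errors showing up late: with a pipe open, the command's exit status becomes available when you close the handle, so you can read everything first and check afterwards. Here is a minimal sketch, assuming a Unix-like system. The child command is a hypothetical stand-in for cmd $blah; it runs the current perl ($^X) so that the example is self-contained:

```perl
use strict;
use warnings;

# Hypothetical stand-in for "cmd $blah": print two lines, then exit with status 3
my @cmd = ( $^X, '-e', 'print "one\n"; print "two\n"; exit 3' );

# List form: the arguments go straight to the program, no shell involved
open my $fh, '-|', @cmd or die "Could not start @cmd: $!";
my @lines = <$fh>;

# close() on a pipe waits for the child; it returns false if the child
# exited non-zero, and the wait status is left in $?
my $ok     = close $fh;
my $status = $? >> 8;

print "read ", scalar @lines, " lines, exit status $status\n";
```

Because the list form bypasses the shell, $blah does not need shell quoting, but it also means shell features such as | are not available in that form.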

With short output, backticks might do the job, although in this case you have to be more careful to scrub the inputs so they aren't misinterpreted by the shell:

my $output = `cmd $blah`;

There are various modules on CPAN that handle this sort of thing, too.

Some comments on temporary files

The comments mentioned race conditions, so I thought I'd write a few things for those wondering what people are talking about.

In the original code, Andreas uses File::Temp, a module from the Perl Standard Library. However, they use the tmpnam POSIX-like call, which has this caveat in the docs:

Implementations of mktemp(), tmpnam(), and tempnam() are provided, but should be used with caution since they return only a filename that was valid when function was called, so cannot guarantee that the file will not exist by the time the caller opens the filename.

This is discouraged. The corresponding POSIX::tmpnam was deprecated in Perl v5.22 and removed in v5.26.

That is, you get back the name of a file that does not exist yet. After you get the name, you don't know if that filename was made by another program. And, that unlink later can cause problems for one of the programs.

The "race condition" comes in when two programs that probably don't know about each other try to do the same thing at roughly the same time. Your program tries to make a temporary file named "foo", and so does some other program. They both might see at the same time that a file named "foo" does not exist, then try to create it. They both might succeed, and as they both write to it, they might interleave or overwrite the other's output. Then, one of those programs thinks it is done and calls unlink. Now the other program wonders what happened.

In the malicious exploit case, some bad actor knows a temporary file will show up, so it recognizes a new file and gets in there to read or write data.

But this can also happen within the same program. Two or more instances of the same program run at the same time and try to do the same thing. With randomized filenames, it is probably exceedingly rare that two running programs will choose the same name at the same time. However, we don't care how rare something is; we care how devastating the consequences are should it happen. And, rare is much more frequent than never.
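
The usual defense against this check-then-create race is to let the operating system create the file atomically and fail if the name is already taken. Here is a small sketch of that idea using sysopen with O_CREAT and O_EXCL. The file name is hypothetical and chosen only for the demonstration:

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_EXCL O_RDWR);
use File::Spec;

# Hypothetical name, used only for this demonstration
my $path = File::Spec->catfile( File::Spec->tmpdir,
    sprintf( 'excl-demo-%d-%d', $$, time ) );

# O_CREAT | O_EXCL: create the file only if the name is unused, in one step
my $first = sysopen( my $fh1, $path, O_CREAT | O_EXCL | O_RDWR, 0600 );

# A second attempt on the same name fails instead of sharing the file
my $second = sysopen( my $fh2, $path, O_CREAT | O_EXCL | O_RDWR, 0600 );
my $error  = "$!";

print "first attempt:  ", ( $first  ? "created" : "failed" ), "\n";
print "second attempt: ", ( $second ? "created" : "failed ($error)" ), "\n";

close $fh1;
unlink $path or warn "Could not remove $path: $!";
```

Whoever wins the creation gets an open handle right away, so there is no gap between choosing a name and opening the file.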

File::Temp

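Knowing all that, File::Temp handles the details of ensuring that you get a filehandle and not just a name. The tempfile function creates and opens the file in one step. Call it as a function, not as a class method; it returns the handle first, then the name:

use File::Temp qw(tempfile);
my( $fh, $name ) = tempfile();

This uses a default template to create the name. With tempfile, the file is only removed (at program exit) if you pass UNLINK => 1. With the object interface, File::Temp also cleans up the mess when the object goes out of scope:

{
my $fh = File::Temp->new;
print $fh ...;
...;
}  # file cleaned up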
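
To see the cleanup happen, here is a small sketch using the object interface (File::Temp->new), which removes the file when the object goes out of scope. The variable names are just for illustration:

```perl
use strict;
use warnings;
use File::Temp;

my ( $name, $existed_inside );
{
    my $tmp = File::Temp->new;    # created and opened in one step
    $name = $tmp->filename;
    print {$tmp} "scratch data\n";
    $existed_inside = -e $name ? 1 : 0;
}    # $tmp goes out of scope here and the file is removed

my $exists_after = -e $name ? 1 : 0;
print "inside the block: $existed_inside, after the block: $exists_after\n";
```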

Some systems might automatically clean up temp files, although I haven't cared about that in years. Typically it was a batch thing (say once a week).

I often go one step further by giving my temporary filenames a template, where the Xs are literal characters the module recognizes and fills in with randomized characters:

my( $fh, $name ) = File::Temp::tempfile(
    sprintf "$0-%d-XXXXXX", time );

I'm often doing this while I'm developing things so I can watch the program make the files (and in which order) and see what's in them. In production I probably want to obscure the source program name ($0) and the time; I don't want to make it easier to guess who's making which file.
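
A sketch of a template with a few of the documented tempfile options. The "scratch" prefix and the ".txt" suffix are made up for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ( $fh, $name ) = tempfile(
    'scratch-XXXXXX',     # the trailing Xs are replaced with random characters
    TMPDIR => 1,          # put the file in the system temp directory
    SUFFIX => '.txt',     # appended after the randomized part
    UNLINK => 1,          # remove the file when the program exits
);

print {$fh} "hello\n";
close $fh or die "Could not close $name: $!";
print "created $name\n";
```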

A scratchpad

I can also open a temporary file with open by giving it undef in place of a filename. This creates an anonymous temporary file, which is useful when you want to collect data in a file rather than in memory. Opening it read-write means you can output some stuff then move around in that file (we show a fixed-length record example in Learning Perl):

open(my $tmp, "+>", undef) or die ...

print $tmp "Some stuff\n";
seek $tmp, 0, 0;
my $line = <$tmp>;
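
A rough sketch of the fixed-length record idea on such a scratch file (this is not the Learning Perl example, and the 8-byte record length and sample data are arbitrary). Because every record has the same size, record N starts at byte N times the record length:

```perl
use strict;
use warnings;

# Anonymous temporary file, opened read-write
open( my $tmp, '+>', undef ) or die "Could not open anonymous temp file: $!";

my $len = 8;    # arbitrary record length for this example

# Write three space-padded records of exactly $len bytes each
print {$tmp} pack( "A$len", $_ ) for qw(alpha beta gamma);

# Jump straight to record 1 (the second record) and read it back
seek( $tmp, 1 * $len, 0 ) or die "Could not seek: $!";
my $got = read( $tmp, my $raw, $len );
die "Short read" unless defined $got && $got == $len;

my ($record) = unpack( "A$len", $raw );    # "A" strips the padding
print "record 1 is '$record'\n";
```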

Upvotes: 4

lordadmira

Reputation: 1832

File::Temp opens the temp file in O_RDWR mode so all you have to do is use that one file handle for both reading and writing, even from external programs. The returned file handle is overloaded so that it stringifies to the temp file name so you can pass that to the external program. If that is dangerous for your purpose you can get the fileno() and redirect to /dev/fd/<fileno> instead.

All you have to do is mind your seeks and tells. :-) Just remember to always set autoflush!

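use strict;
use warnings;
use File::Temp;
use Data::Dump;

my $fh = File::Temp->new;
$fh->autoflush;

# $fh stringifies to the temp file's name, so the shell appends to that file
system("ls /tmp/*.txt >> $fh") == 0 or die "ls failed with status $?";

# the handle is still at offset 0, so this reads what ls wrote
my @lines = <$fh>;
printf "%s\n\n", Data::Dump::pp(\@lines);

# the handle is now at end of file, so this appends
print $fh "How now brown cow\n";

seek $fh, 0, 0 or die $!;
my @lines2 = <$fh>;
printf "%s\n", Data::Dump::pp(\@lines2);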

Which prints

[
  "/tmp/cpan_htmlconvert_DPzx.txt\n",
  "/tmp/cpan_htmlconvert_DunL.txt\n",
  "/tmp/cpan_install_HfUe.txt\n",
  "/tmp/cpan_install_XbD6.txt\n",
  "/tmp/cpan_install_yzs9.txt\n",
]

[
  "/tmp/cpan_htmlconvert_DPzx.txt\n",
  "/tmp/cpan_htmlconvert_DunL.txt\n",
  "/tmp/cpan_install_HfUe.txt\n",
  "/tmp/cpan_install_XbD6.txt\n",
  "/tmp/cpan_install_yzs9.txt\n",
  "How now brown cow\n",
]
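
If you would rather not hand any file name to the shell at all, another option (not what the code above does) is to fork and dup the temp file handle onto the child's STDOUT before exec. This is only a sketch, assuming a Unix-like system with sh, printf and sort available. The pipeline is a hypothetical stand-in for cmd $blah | cmd2:

```perl
use strict;
use warnings;
use File::Temp;

my $tmp = File::Temp->new;

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child: make STDOUT a duplicate of the temp file handle, then run the
    # pipeline; the last command in the pipeline inherits that STDOUT
    open STDOUT, '>&', $tmp or die "Cannot dup temp handle: $!";
    exec 'sh', '-c', 'printf "b\na\n" | sort' or die "exec failed: $!";
}

# Parent: wait for the pipeline to finish before reading anything
waitpid $pid, 0;
my $status = $? >> 8;

# Parent and child share one file offset, so rewind before reading
seek $tmp, 0, 0 or die "Could not seek: $!";
my @lines = <$tmp>;

print "exit status $status, read ", scalar @lines, " lines\n";
```

The parent never reopens the file by name, and it reads only after waitpid has returned, so the whole output and the exit status are both in hand before any processing starts.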

HTH

Upvotes: 2
