fabrizio

Reputation: 43

How to determine if two handles refer to the same file

I would like to check whether two file handles refer to the same file. Can I do this by applying the stat function to each file handle? Thanks in advance.

my $file = 'C:\temp\file.txt';
open( TXT1, "> $file" );
open( TXT2, "> $file" );
print( "The handles refer to the same file!") if (\TXT1 eq \TXT2);

Upvotes: 3

Views: 378

Answers (2)

zdim

Reputation: 66899

That can't be done in general (portably), which is understandable given that a "file"handle need not be associated with a file at all. One thing you can do is to record the filename under the fileno of each filehandle when you open it.

So when opening a file

my %filename_fileno;

open my $fh, '>', $file or die "Can't open $file: $!";

$filename_fileno{fileno $fh} = $file;

and then you can look it up when needed

say "Filename is: ", $filename_fileno{fileno $fh};

Don't forget to remove the entry from the hash when that file is (to be) closed

delete $filename_fileno{fileno $fh};
close $fh;

So these should be in utility functions. Given that more care is needed, as outlined in the footnotes, altogether this would make for a nice little module. One could also consider extending (inheriting from) a related module, like Path::Tiny.
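As a minimal sketch of such utility functions, assuming a single process-wide hash is acceptable: the names open_tracked, filename_of, and close_tracked are made up for illustration and do not come from any existing module.

```perl
use strict;
use warnings;

# Hypothetical helpers; the names are illustrative only.
my %filename_for_fileno;

# Open a file and remember its name under the handle's fileno.
sub open_tracked {
    my ($mode, $file) = @_;
    open my $fh, $mode, $file or die "Can't open $file: $!";
    $filename_for_fileno{ fileno $fh } = $file;
    return $fh;
}

# Return the recorded name, or undef for an untracked or closed handle.
sub filename_of {
    my ($fh) = @_;
    my $fd = fileno $fh;
    return defined $fd ? $filename_for_fileno{$fd} : undef;
}

# Forget the entry before closing, since descriptor numbers get reused.
sub close_tracked {
    my ($fh) = @_;
    my $fd = fileno $fh;
    delete $filename_for_fileno{$fd} if defined $fd;
    return close $fh;
}
```

Deleting the entry inside close_tracked matters because the operating system is free to hand the same descriptor number to the next file that is opened, and a stale entry would then report the wrong name.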

Note: You cannot safely write to a file from separate filehandles as in the question. Each filehandle keeps track of its own position in the file, so writes through one filehandle will clobber intermediate writes made through the other.
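The clobbering can be seen in a small experiment. This is only a sketch: write_via_two_handles is a hypothetical helper, the file lives in a temporary directory, and autoflush is turned on so that the order of the writes is deterministic.

```perl
use strict;
use warnings;
use IO::Handle;               # for autoflush
use File::Temp qw(tempdir);

# Write 'AAAA' through one handle and then 'BB' through another, both
# opened on the same file in the given mode; return the file's contents.
sub write_via_two_handles {
    my ($mode) = @_;
    my $file = tempdir(CLEANUP => 1) . '/demo.txt';

    open my $fh1, $mode, $file or die "Can't open $file: $!";
    open my $fh2, $mode, $file or die "Can't open $file: $!";
    $_->autoflush(1) for $fh1, $fh2;    # each print reaches the file at once

    print {$fh1} 'AAAA';
    print {$fh2} 'BB';
    close $fh1 or die "Can't close $file: $!";
    close $fh2 or die "Can't close $file: $!";

    open my $in, '<', $file or die "Can't read $file: $!";
    local $/;                           # slurp mode
    my $content = <$in>;
    return $content;
}

my $overwritten = write_via_two_handles('>');     # each handle has its own offset
my $appended    = write_via_two_handles('>>');    # each write goes to the end
```

On a POSIX system I would expect $overwritten to be 'BBAA', because the second handle still writes at offset 0, and $appended to be 'AAAABB', because append mode positions every write at the current end of the file.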

Note: Use lexical filehandles (my $fh) and not globs (FH), use the three-argument open, and always check the open call.


On some (most?) Linux systems you can use the /proc filesystem

say readlink("/proc/$$/fd/" . fileno $fh);

and on more (all?) Unix-y systems you can use the (device and) inode number

say for (stat $fh)[0,1];

Links, both soft (symbolic) and hard, give the same data different names, and the data can be changed through any of them. So we can have different filenames but the same "file" (data).

On Windows systems the best way to check is given in this post, except for the hardlink case for which one would have to use the other answer's method (parse output), as far as I can tell.

Also, non-canonical names, as well as different capitalizations (on case-insensitive systems), short/long names on some systems, (more?) ... can make for different names for the same file. This is easier to clean up using modules, but it needs to be handled as well.

On most (all?) other systems the notion of inode and any available stat-like functionality makes these a non-issue, since device+inode refers uniquely to data.

Thanks to ikegami for comments on this.

Upvotes: 4

ikegami

Reputation: 386396

Yes, on some system+device combinations, stat can be used.

use File::stat;

my $st1 = stat($fh1)
   or die $!;
my $st2 = stat($fh2)
   or die $!;

say $st1->dev == $st2->dev && $st1->ino == $st2->ino ? "same" : "different";

Notably, this won't work on NTFS or FAT32, one of which you appear to be using.
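For a system where device and inode numbers are meaningful, the comparison above can be wrapped in a helper and tried against a second handle on the same path, a different file, and a hard link. This is a sketch only: same_file and the file names are invented for the demonstration, and it assumes the temporary directory supports hard links.

```perl
use strict;
use warnings;
use File::stat;
use File::Temp qw(tempdir);

# Hypothetical helper: true if both handles share device and inode numbers.
sub same_file {
    my ($fh1, $fh2) = @_;
    my $st1 = stat($fh1) or die "Can't stat first handle: $!";
    my $st2 = stat($fh2) or die "Can't stat second handle: $!";
    return $st1->dev == $st2->dev && $st1->ino == $st2->ino;
}

my $dir = tempdir(CLEANUP => 1);
my ($path_a, $path_b, $path_link) = map { "$dir/$_" } qw(a.txt b.txt a_link.txt);

open my $fh_a1, '>>', $path_a or die "Can't open $path_a: $!";
open my $fh_a2, '<',  $path_a or die "Can't open $path_a: $!";
open my $fh_b,  '>>', $path_b or die "Can't open $path_b: $!";

# A hard link gives the same data a second name.
link $path_a, $path_link or die "Can't link $path_link: $!";
open my $fh_link, '<', $path_link or die "Can't open $path_link: $!";

my $same_path  = same_file($fh_a1, $fh_a2);      # same path opened twice
my $other_file = same_file($fh_a1, $fh_b);       # unrelated file
my $hard_link  = same_file($fh_a1, $fh_link);    # same data, different name
```

Because the check looks at the open file rather than its name, it should report the hard link as the same file, which a comparison of path strings cannot do.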

You should have your program keep track of that itself, but that's easier said than done. It's rather hard to identify if two paths refer to the same file when you have to contend with hard links, soft links, paths with ./.., capitalization/slash differences, short/long names, conventional/UNC paths, shares, etc.

For example, all of the following paths could refer to the same file:

  • C:\Moo\Foo Bar.txt
  • C:\Hardlink\Foo Bar.txt
  • C:\Softlink\Foo Bar.txt
  • C:\.\Moo\Foo Bar.txt
  • C:\Baz\..\Moo\Foo Bar.txt
  • c:\moo\foo bar.txt
  • C:/Moo/Foo Bar.txt
  • C:\Moo\FOOBAR~1.TXT
  • \\?\C:\Moo\Foo Bar.txt
  • \\127.0.0.1\C$\Moo\Foo Bar.txt
  • Z:\Moo\Foo Bar.txt

All but the last two can be handled via normalization.
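A rough sketch of what such normalization could look like, using only core modules. The helper name path_key is hypothetical, the path is assumed to exist so that it can be resolved, and short names are not attempted here (on Windows that would need something like Win32::GetLongPathName). Hard links also remain outside what path-based normalization can detect.

```perl
use strict;
use warnings;
use Cwd qw(realpath);
use File::Spec;

# Hypothetical helper: reduce a path to a key suitable for comparison.
# Assumes the path exists, so that realpath can resolve it.
sub path_key {
    my ($path) = @_;
    my $real = realpath($path);          # absolute; resolves ., .. and symlinks
    die "Can't resolve $path" unless defined $real;
    my $key = File::Spec->canonpath($real);           # tidy separators
    $key = lc $key if File::Spec->case_tolerant;      # e.g. on Windows
    return $key;
}
```

Two paths are then treated as naming the same file when path_key returns equal strings for both.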

Upvotes: 2
