roovalk

Reputation: 111

Perl: strange behaviour of glob with files greater than 2 GB

I'm simply trying to get a list of filenames from a path containing wildcards.

my $path = "/foo/bar/*/*.txt";
my @file_list = glob($path);
foreach my $current_file (@file_list) {
   print "\n- $current_file";
}

Mostly this works perfectly, but if there's a file larger than 2 GB somewhere in one of the /foo/bar/* subpaths, glob returns an empty list without any error or warning.

If I remove the file, or add a character/bracket sequence like this:

my $path = "/foo/bar/*[0-9]/*.txt";

or

my $path = "/foo/bar/*1/*.txt";

then the glob works again.

UPDATE:

Here's an example (for business policy reasons I had to mask the pathnames):

[root]/foo/bar # ls -lrt
drwxr-xr-x    2 root     system         256 Oct 11 2006  lost+found
drwxr-xr-x    2 root     system         256 Dec 27 2007  abc***
drwxr-xr-x    2 root     system         256 Nov 12 15:32 cde***
-rw-r--r--    1 root     system  2734193149 Nov 15 05:07 archive1.tar.gz
-rw-r--r--    1 root     system     6913743 Nov 16 05:05 archive2.tar.gz
drwxr-xr-x    2 root     system         256 Nov 16 10:00 fgh***
[root]/foo/bar # /home/user/test.pl
[root]/foo/bar #

Removing the >2GB file (or globbing with "/foo/bar/[acf]*/" instead of "/foo/bar/*/"):

[root]/foo/bar # ls -lrt
drwxr-xr-x    2 root     system         256 Oct 11 2006  lost+found
drwxr-xr-x    2 root     system         256 Dec 27 2007  abc***
drwxr-xr-x    2 root     system         256 Nov 12 15:32 cde***
-rw-r--r--    1 root     system     6913743 Nov 16 05:05 archive2.tar.gz
drwxr-xr-x    2 root     system         256 Nov 16 10:00 fgh***

[root]/foo/bar # /home/user/test.pl
- /foo/bar/abc***/heapdump.phd.gz
- /foo/bar/cde***/javacore.txt.gz
- /foo/bar/fgh***/stuff.txt
[root]/foo/bar #

Any suggestion?

I'm working with Perl 5.8.8 on AIX 5.3. The filesystem is a local JFS.

Upvotes: 11

Views: 442

Answers (2)

Richard Huxton

Reputation: 22943

In the absence of a proper answer, you're going to want a work-around. I'm guessing you've hit some platform-specific bug in the glob() implementation of 5.8.8.

I had a quick look at the source on CPAN but my C is too rusty to spot anything useful.

There have been lots of changes to that module though, so a bug may well have been reported and fixed. You're not even on the last release of 5.8 - there's a 5.8.9 out there which mentions updates to AIX compatibility and File::Glob.

I'd test this by installing local::lib (if you haven't already) and perhaps cpanm, then trying to update File::Glob to see what that does. You might need to download the files by hand from e.g. here

If that solves the problem then you can either deploy updates to the required systems, or you'll have to re-implement the bits of glob() you want. Which is going to depend on how complex your patterns get.
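If you do end up re-implementing the part of glob() you need, a minimal sketch for the two-level pattern from the question might look like the following. The helper name `list_one_level_down` is made up for illustration; it relies only on opendir/readdir and a regex, and it has not been tried on AIX 5.3, so treat it as a starting point rather than a verified fix.

```perl
use strict;
use warnings;

# Hypothetical helper: roughly emulates glob("$base/*/<name pattern>")
# using readdir. Returns files matching $name_re exactly one directory
# level below $base.
sub list_one_level_down {
    my ($base, $name_re) = @_;
    my @found;

    opendir(my $top, $base) or return;    # unreadable base: empty list
    # Skip dot entries, as a shell-style '*' would, and keep directories only.
    my @subdirs = grep { !/^\./ && -d "$base/$_" } readdir($top);
    closedir($top);

    for my $sub (sort @subdirs) {
        opendir(my $dh, "$base/$sub") or next;    # skip unreadable directories
        push @found, map { "$base/$sub/$_" }
                     sort grep { !/^\./ && $_ =~ $name_re } readdir($dh);
        closedir($dh);
    }
    return @found;
}

# Rough equivalent of the pattern /foo/bar/*/*.txt from the question.
print "- $_\n" for list_one_level_down("/foo/bar", qr/\.txt\z/);
```

Plain files directly under the base directory (such as the large archive) are never opened here; they are only tested with -d and skipped.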

If it doesn't solve the problem then at least you'll be able to stick some printf's into the code and see what it's doing.
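Before patching the module itself, it may be worth asking File::Glob to report errors instead of silently returning nothing. The sketch below calls `bsd_glob` with the `GLOB_ERR` flag and then checks `GLOB_ERROR`, both of which File::Glob exports via its `:glob` tag. The wrapper name `glob_or_report` is hypothetical, and printing `$Config{uselargefiles}` is only a guess that large-file support is relevant here, not a diagnosis.

```perl
use strict;
use warnings;
use Config;
use File::Glob qw(:glob);    # exports bsd_glob, GLOB_ERR, GLOB_ERROR, ...

# Hypothetical wrapper: glob with GLOB_ERR so that failures while reading
# directories are reported rather than silently ignored.
sub glob_or_report {
    my ($pattern) = @_;
    my @files = bsd_glob($pattern, GLOB_ERR);
    if (GLOB_ERROR) {
        warn "glob('$pattern') failed: code ", GLOB_ERROR, ", errno: $!\n";
    }
    return @files;
}

# Assumption: large-file support in this perl build might matter here.
print "uselargefiles: ", ($Config{uselargefiles} || 'undef'), "\n";
print "- $_\n" for glob_or_report("/foo/bar/*/*.txt");
```

Note that `bsd_glob` with only `GLOB_ERR` does not apply the csh-style defaults that the built-in glob() uses (brace expansion, for example), which should not matter for a pattern as simple as this one.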

Hopefully someone will post a real answer and make this redundant about 5 minutes after I click "Post Your Answer" though.

Upvotes: 4

Flow

Reputation: 35

I've never used the new glob function before, so I can't comment on its benefits or problems, but it seems quite a lot of people have had issues using it: see https://stackoverflow.com/search?q=perl+glob&submit=search for some questions and possible solutions.

If you don't mind trying out something else, here is my tried and tested 'old school' Perl solution, which I have used in countless projects:

my $path = "/foo/bar/";
my @result_array = qx(find $path -iname '*.txt'); #run the system find command

If you, for whatever reason, prefer not to run a system command from within your script, then look up the built-in File::Find module instead: http://search.cpan.org/~dom/perl-5.12.5/lib/File/Find.pm
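A minimal sketch of the File::Find route, assuming you want the same case-insensitive `*.txt` match as the `find -iname` command above. The helper name `find_txt_files` is made up for illustration, and the `-f` test (regular files only) is an addition that the shell command does not have.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collect regular files whose names end in .txt
# (case-insensitive, like find's -iname) anywhere below $path.
sub find_txt_files {
    my ($path) = @_;
    my @found;
    find(sub {
        # $_ is the current file name; $File::Find::name is its full path.
        push @found, $File::Find::name if -f $_ && /\.txt\z/i;
    }, $path);
    my @sorted = sort @found;
    return @sorted;
}

print "- $_\n" for find_txt_files("/foo/bar");
```

Like the find command, this descends through every level below the starting directory, so it can return more files than the original /foo/bar/*/*.txt pattern would.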

Good luck.

Upvotes: -3
