Reputation: 651
I have a directory containing multiple sub-directories, and each sub-directory has a fixed set of files, one per category, like this:
1)Main_dir
1.1) Subdir1 with files
- Test.1.age.txt
- Test.1.name.txt
- Test.1.place.csv
..........
1.2) Subdir2 with files
- Test.2.age.txt
- Test.2.name.txt
- Test.2.place.csv
.........
There are around 20 folders with 10 files in each. I first need to concatenate the files in each category, e.g. Test.1.age.txt and Test.2.age.txt, into a Combined.age.txt file. Once all the concatenation is done, I want to print these file names to a new Final_list.txt file, like this:
./Main_dir/Combined.age.txt
./Main_dir/Combined.name.txt
I am able to read all the files from all the sub-directories into an array, but I am not sure how to do a pattern search for similar file names. I will be able to figure out the printout part of the code myself. Can anyone please show how to do this pattern search for the concatenation? My code so far:
use warnings;
use strict;
use File::Spec;
use Data::Dumper;
use File::Basename;
my $testdir = './Main_dir';
my @Comp_list = glob("$testdir/test_dir*/*.txt");
foreach my $file (@Comp_list) {
    print "$file\n";
}
I am trying to do a pattern search on the contents of @Comp_list, which I clearly still need to learn:
foreach my $f1 (@Comp_list) {
    if ($f1 =~ m{^\./.*\.txt$}) {
        print "$f1\n"; # check if reading the file right
        # push it to a file using concatfile(
    }
}
Thanks a lot!
Upvotes: 1
Views: 731
Reputation: 126722
This should work for you. I've only tested it superficially, as it would take me a while to create some test data; since you have some at hand, I'm hoping you'll report back with any problems.
The program segregates all the files found by the equivalent of your glob call and puts them in buckets according to their type. I've assumed that the names are exactly as you've shown, so the type is the penultimate field when the file name is split on dots; i.e. the type of Test.1.age.txt is age.
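As a minimal sketch of that splitting rule on its own: `category_of` is a hypothetical helper name introduced only for illustration, and the sample paths assume the `test_dir*` layout used in the question's glob.

```perl
use strict;
use warnings;

# Hypothetical helper: return the category of a file, taken as the
# penultimate dot-separated field of its name (lower-cased).
sub category_of {
    my ($path) = @_;
    my @fields = split /\./, $path;
    return lc $fields[-2];
}

# Assumed example paths, following the naming shown in the question
my @examples = (
    './Main_dir/test_dir1/Test.1.age.txt',
    './Main_dir/test_dir2/Test.2.name.txt',
);
print category_of($_), "\n" for @examples;   # prints "age", then "name"
```

Because only the last two fields matter, dots earlier in the path (such as the leading `./`) do not affect the result.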
Having collected all of the file lists, I've used a technique that was originally designed to read through all of the files specified on the command line. If @ARGV is set to a list of files then an <ARGV> operation will read through all the files as if they were one, so they can easily be copied to a new output file.
If you need the files concatenated in a specific order then I will have to amend my solution. At present they will be processed in the order that glob returns them, probably in lexical order of their file names, but you shouldn't rely on that.
use strict;
use warnings 'all';
use v5.14.0; # For autoflush method
use File::Spec::Functions 'catfile';
use constant ROOT_DIR => './Main_dir';
my %files;
my $pattern = catfile(ROOT_DIR, 'test_dir*', '*.txt');
for my $file ( glob $pattern ) {
    my @fields = split /\./, $file;
    my $type = lc $fields[-2];
    push @{ $files{$type} }, $file;
}
STDOUT->autoflush; # Get prompt reports of progress
for my $type ( keys %files ) {
    my $outfile = catfile(ROOT_DIR, "Combined.$type.txt");
    open my $out_fh, '>', $outfile or die qq{Unable to open "$outfile" for output: $!};
    my $files = $files{$type};
    printf qq{Writing aggregate file "%s" from %d input file%s ... },
        $outfile,
        scalar @$files,
        @$files == 1 ? '' : 's';
    local @ARGV = @$files;
    print $out_fh $_ while <ARGV>;
    print "complete\n";
}
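The question also asks for a Final_list.txt naming each combined file, which the program above does not write. One possible sketch follows; `write_final_list` is a hypothetical helper, and building the paths by plain string interpolation is an assumption made so that the leading `./` from the question's example is kept as typed.

```perl
use strict;
use warnings;

# Hypothetical helper: write one combined-file path per category to $list_file.
sub write_final_list {
    my ($root_dir, $list_file, @types) = @_;
    open my $list_fh, '>', $list_file
        or die qq{Unable to open "$list_file" for output: $!};
    # Sort so the listing order does not depend on hash ordering
    print {$list_fh} "$root_dir/Combined.$_.txt\n" for sort @types;
    close $list_fh or die qq{Unable to close "$list_file": $!};
    return;
}

# In the program above this could be called after the loop as:
#   write_final_list(ROOT_DIR, 'Final_list.txt', keys %files);
write_final_list('./Main_dir', 'Final_list.txt', qw(name age));
```

The demonstration call writes Final_list.txt in the current directory with one path per line, sorted by category.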
Upvotes: 3
Reputation: 356
I think it's easier if you categorize the files first; then you can work with them.
use warnings;
use strict;
use File::Spec;
use Data::Dumper;
use File::Basename;
my %hash;
my $testdir = './main_dir';
my @comp_list = glob("$testdir/*/*.txt");
foreach my $file (@comp_list) {
    # Skip names that do not look like Name.N.category.txt; testing the
    # match itself avoids reading a stale $1 left by an earlier match
    next unless $file =~ /(\w+\.\d+\..+\.txt)/;
    my @tmp = split(/\./, $1);
    # Autovivification creates the array reference on first use
    push @{ $hash{$tmp[-2]} }, $file;
}
print Dumper(\%hash);
Files:
main_dir
├── sub1
│ ├── File.1.age.txt
│ └── File.1.name.txt
└── sub2
├── File.2.age.txt
└── File.2.name.txt
Result:
$VAR1 = {
'age' => [
'./main_dir/sub1/File.1.age.txt',
'./main_dir/sub2/File.2.age.txt'
],
'name' => [
'./main_dir/sub1/File.1.name.txt',
'./main_dir/sub2/File.2.name.txt'
]
};
You can then write a loop over this hash to concatenate the files in each category.
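A rough sketch of such a loop, assuming the hash maps each category to an array reference of file paths as in the Dumper output. `combine_categories` is a hypothetical helper name, the `Combined.<category>.txt` naming follows the question, and the demonstration runs on throwaway files in a temporary directory rather than on real data.

```perl
use strict;
use warnings;
use File::Temp 'tempdir';

# Hypothetical helper: concatenate each category's files into
# "$out_dir/Combined.<category>.txt" and return the paths written.
sub combine_categories {
    my ($out_dir, $by_category) = @_;
    my @written;
    for my $category (sort keys %$by_category) {
        my $outfile = "$out_dir/Combined.$category.txt";
        open my $out_fh, '>', $outfile or die qq{Cannot write "$outfile": $!};
        # Plain string sort of the input paths (so File.10 sorts before File.2)
        for my $infile (sort @{ $by_category->{$category} }) {
            open my $in_fh, '<', $infile or die qq{Cannot read "$infile": $!};
            print {$out_fh} $_ while <$in_fh>;
            close $in_fh;
        }
        close $out_fh or die qq{Cannot close "$outfile": $!};
        push @written, $outfile;
    }
    return @written;
}

# Small demonstration on throwaway files in a temporary directory
my $dir = tempdir(CLEANUP => 1);
my %hash;
for my $n (1, 2) {
    my $file = "$dir/File.$n.age.txt";
    open my $fh, '>', $file or die qq{Cannot write "$file": $!};
    print {$fh} "age from file $n\n";
    close $fh;
    push @{ $hash{age} }, $file;
}
print "$_\n" for combine_categories($dir, \%hash);
```

The returned list of paths is what would be printed into a Final_list.txt file. If the inputs must be joined in numeric order, the inner `sort` would need a numeric comparison on the number field instead.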
Upvotes: 3