Reputation: 440
I have a file reflog
with the content as below. There will be items with same name but different extensions. I want to check that for each of the items (file1
, file2
& file3
here as example), it needs to be exist in both extensions (.abc
and .def
). If both extensions exist, it will perform some regex and print out. Else it will just report out with the file name together with extension (ie, if only on of file1.abc or file1.def exists, it will be printed out).
file1.abc
file2.abc
file2.def
file3.abc
file3.def
file4.abc
file5.abc
file5.def
file6.def
file8abc.def
file7.abc
file1.def
file9abc.def
file10def.abc
My script is as below (editted from yb007 script), but I have some issues with the output that I don;t know how to resolve. I notice the output is going to be wrong when the reflog file having any file with the name *abc.def (such as ie. file8abc.def & file9abc.def). It will be trim down the last 4 suffix and return the wrong .ext (which is .abc here but I suppose it should be .def).
#! /usr/bin/perl
use strict;
use warnings;
my @files_abc ;
my @files_def ;
my $line;
open(FILE1, 'reflog') || die ("Could not open reflog") ;
open (FILE2, '>log') || die ("Could not open log") ;
while ($line = <FILE1>) {
if($line=~ /(.*).abc/) {
push(@files_abc,$1);
} elsif ($line=~ /(.*).def/) {
push(@files_def,$1); }
}
close(FILE1);
my %first = map { $_ => 1 } @files_def ;
my @same = grep { $first{$_} } @files_abc ;
my @abc_only = grep { !$first{$_} } @files_abc ;
foreach my $abc (sort @abc_only) {
$abc .= ".abc";
}
my %second = map {$_=>1} @files_abc;
my @same2 = grep { $second{$_} } @files_def; #@same and same2 are equal.
my @def_only = grep { !$second{$_} } @files_def;
foreach my $def (sort @def_only) {
$def .= ".def";
}
my @combine_all = sort (@same, @abc_only, @def_only);
print "\nCombine all:-\n @combine_all\n" ;
print "\nList of files with same extension\n @same";
print "\nList of files with abc only\n @abc_only";
print "\nList of files with def only\n @def_only";
foreach my $item (sort @combine_all) {
print FILE2 "$item\n" ;
}
close (FILE2) ;
My output is like this which is wrong:- 1st:- print screen output as below: Combine all:- file.abc file.abc file1 file10def.abc file2 file3 file4.abc file5 file6.def file7.abc
List of files with same extension
file1 file2 file3 file5
List of files with abc only
file4.abc file.abc file7.abc file.abc file10def.abc
List of files with def only
file6.def
Log output as below:
**file.abc
file.abc**
file1
file10def.abc
file2
file3
file4.abc
file5
file6.def
file7.abc
Can you pls help me take a look where gies wrong? Thanks heaps.
Upvotes: 1
Views: 1545
Reputation: 1377
use strict;
use warnings;
my @files_abc;
my @files_def;
my $line;
open(FILE,'reflog') || die ("could not open reflog");
while ($line = <FILE>) {
if($line=~ /(.*)\.abc/) {
push(@files_abc,$1);
}
elsif($line=~ /(.*)\.def/) {
push(@files_def,$1);
}
}
close(FILE);
my %second = map {$_=>1} @files_def;
my @same = grep { $second{$_} } @files_abc;
print "\nList of files with same extension\n @same";
foreach my $abc (@files_abc) {
$abc .= ".abc";
}
foreach my $def (@files_def) {
$def .= ".def";
}
print "\nList of files with abc extension\n @files_abc";
print "\nList of files with def extension\n @files_def";
Output is
List of files with same extension
file1 file2 file3 file5
List of files with abc extension
file1.abc file2.abc file3.abc file4.abc file5.abc file7.abc file10def.abc
List of files with def extension
file2.def file3.def file5.def file6.def file8abc.def file1.def file9abc.def
Hope this helps...
Upvotes: 0
Reputation: 754140
You don't need to slurp the whole file; you can read one line at a time. I think this code works on this extended version of your reflog
file:
#!/usr/bin/env perl
use strict;
use warnings;
open my $file, '<', "reflog" or die "Failed to open file reflog for reading ($!)";
open my $func, '>', 'log' or die "Failed to create file log for writing ($!)";
my ($oldline, $oldname, $oldextn) = ("", "", "");
while (my $newline = <$file>)
{
chomp $newline;
$newline =~ s/^\s*//;
my ($newname, $newextn) = ($newline =~ m/(.*)([.][^.]*)$/);
if ($oldname eq $newname)
{
# Found the same file - presumably $oldextn eq ".abc" and $newextn eq ".def"
print $func "$newname\n";
print "$newname\n";
$oldline = "";
$oldname = "";
$oldextn = "";
}
else
{
print $func "$oldline\n" if ($oldline);
print "$oldline\n" if ($oldline);
$oldline = $newline;
$oldname = $newname;
$oldextn = $newextn;
}
}
print $func "$oldline\n" if ($oldline);
print "$oldline\n" if ($oldline);
#unlink "reflog" ;
chmod 0644, "log";
close $func;
close $file;
Since the code does not actually check the extensions, it would be feasible to omit $oldextn
and $newextn
; on the other hand, you might well want to check the extensions if you're sufficiently worried about the input format to need to deal with leading white space.
I very seldom find it good for a processing script like this to remove its own input, hence I've left unlink "reflog";
commented out; your mileage may vary. I would also often just read from standard input and write to standard output; that would simplify the code quite a bit. This code writes to both the log file and to standard output; obviously, you can omit either output stream. I was too lazy to write a function to handle the writing, so the print
statements come in pairs.
This is a variant on control-break reporting.
file1.abc
file1.def
file2.abc
file2.def
file3.abc
file3.def
file4.abc
file5.abc
file5.def
file6.def
file7.abc
$ perl xx.pl
file1
file2
file3
file4.abc
file5
file6.def
file7.abc
$ cat log
file1
file2
file3
file4.abc
file5
file6.def
file7.abc
$
#!/usr/bin/env perl
use strict;
use warnings;
open my $file, '<', "reflog" or die "Failed to open file reflog for reading ($!)";
open my $func, '>', 'log' or die "Failed to create file log for writing ($!)";
my @lines;
while (<$file>)
{
chomp;
next if m/^\s*$/;
push @lines, $_;
}
@lines = sort @lines;
my ($oldline, $oldname, $oldextn) = ("", "", "");
foreach my $newline (@lines)
{
chomp $newline;
$newline =~ s/^\s*//;
my ($newname, $newextn) = ($newline =~ m/(.*)([.][^.]*)$/);
if ($oldname eq $newname)
{
# Found the same file - presumably $oldextn eq ".abc" and $newextn eq ".def"
print $func "$newname\n";
print "$newname\n";
$oldline = "";
$oldname = "";
$oldextn = "";
}
else
{
print $func "$oldline\n" if ($oldline);
print "$oldline\n" if ($oldline);
$oldline = $newline;
$oldname = $newname;
$oldextn = $newextn;
}
}
print $func "$oldline\n" if ($oldline);
print "$oldline\n" if ($oldline);
#unlink "reflog" ;
chmod 0644, "log";
close $func;
close $file;
This is very similar to the original code I posted. The new lines are these:
my @lines;
while (<$file>)
{
chomp;
next if m/^\s*$/;
push @lines, $_;
}
@lines = sort @lines;
my ($oldline, $oldname, $oldextn) = ("", "", ""); # Old
foreach my $newline (@lines)
This reads the 'reflog' file, skipping blank lines, saving the rest in the @lines
array. When the lines are all read, they're sorted. Then, instead of a loop reading from the file, the new code reads entries from the sorted array of lines. The rest of the processing is as before. For your described input file, the output is:
file1
file2
file3
Urgh: the chomp $newline;
is not needed, though it is not otherwise harmful. The old-fashioned chop
(a precursor to chomp
) would have been dangerous. Score one for modern Perl.
Upvotes: 0
Reputation: 3153
open( FILE, "reflog" );
open( FUNC, '>log' );
my %seen;
while ( chomp( my $line = <FILE> ) ) {
$line =~ s/^\s*//;
if ( $ine =~ /(\.+)\.(abc|def)$/ ) {
$seen{$1}++;
}
}
foreach my $file ( keys %seen ) {
if ( $seen{$file} > 1 ) {
## do whatever you want to
}
}
unlink "reflog";
chmod( 0750, "log" );
close(FUNC);
close(FILE);
Upvotes: -1
Reputation: 176
Aside from the errors already pointed out, you appear to be loading @lines from FUNC instead of FILE. Is that also a typo?
Also, If reflog truly contains a series of lines with one filename on each line, why would you ever expect the conditional "if ($line =~ /.abc/ && $line =~ /.def/)" to evaluate true?
It would really help if you could post an example from the actual file you are reading from, along with the actual code you are debugging. Or at least edit the question to fix the typos already mentioned
Upvotes: 1
Reputation: 126732
ALWAYS add
use strict;
use warnings;
to the head of your program. They will catch most simple errors before you need to ask for help.
open FILE, "reflog" or die $!;
$ine
that doesn't exist. You mean $line
chomp @lines;
to remove them||
instead of &&
. Instead write if ($line =~ /\.(iif|isp)$/)
If you still have problems when these are fixed then please ask again.
Upvotes: 4