user8450886

Reputation: 11

Text file manipulation in Perl

I am trying to split the file into blocks that run from # start data to # end data. If a block contains the string 'Pen' or 'Laptop', the code should skip it without writing anything; otherwise it should write the block to the output file.

 Input
         # start data a1   
         Data1 Book 1234  
         Data1 Pen 54635  
         Data1 Laptop 4567  
         Data1 Lens 6473  
         # end data a1  
         # start data a2   
         Data2 Book 1234  
         Data2 Box 54635  
         Data2 Card 4567  
         Data2 Lens 6473  
         # end data a2   

 Expected output  

        # start data a2   
        Data2 Book 1234  
        Data2 Box 54635  
        Data2 Card 4567  
        Data2 Lens 6473  
        # end data a2  

The code snippet used:

#!/usr/local/perl
use warnings;
use strict;
open(filein, "<Input.txt");
open(fileout, ">ouput.txt");
my @array;
my $strt =qr/^#\sstart\sdata/;
my $end=qr/^#\send\sdata/; 
while(<filein>)
{
     @array= split(/$strt/../$end/,$_);
     foreach my $i(@array)
     {
        if($i =~ /Pen|Laptop/)
        {
            next;
        }
        else
        {
            print fileout "$_";
        }
    }
}
close(filein);
close(fileout);  



 Obtained Output from the above snippet  
    # start data a1   
    Data1 Book 1234    
    Data1 Book 1234  
    Data1 Pen 54635    
    Data1 Laptop 4567    
    Data1 Lens 6473   
    # end data a1        
    # start data a2      
    Data1 Book 1234    
    Data1 Book 1234  
    Data1 Box 54635  
    Data1 Box 54635  
    Data1 Card 4567    
    Data1 Card 4567  
    Data1 Lens 6473  
    # end data a2     

Upvotes: 0

Views: 804

Answers (2)

kart1657

Reputation: 78

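The script below gives almost the desired output: it writes only the blocks that contain neither 'Pen' nor 'Laptop', but it leaves out the '# end data' lines.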

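#!/usr/bin/perl
use strict;
use warnings;

open(my $in, '<', 'text.txt') || die "Not able to open text.txt $!";
open(my $out, '>>', 'newtext.txt') || die "Not able to open newtext.txt $!";
my @values;
while (my $line = <$in>)
{
        # Buffer every line until the end marker of the current block.
        unless ($line =~ /end data/)
        {
                chomp($line);
                push(@values, $line);
                next;
        }

        # End marker reached: write the buffered block only if none of its
        # lines mentions Pen or Laptop. The end marker itself is never
        # buffered, so it is not written.
        unless (grep { $_ =~ /Pen|Laptop/i } @values)
        {
                print $out "$_\n" foreach @values;
        }
        @values = ();
}
close($in);
close($out);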

Content of text.txt:

# start data a1
Data1 Book 1234
Data1 Pen 54635
Data1 Laptop 4567
Data1 Lens 6473
# end data a1
# start data a2
Data2 Book 1234
Data2 Box 54635
Data2 Card 4567
Data2 Lens 6473
# end data a2
# start data a3
Data2 Book 1234
Data2 Box 54635
Data2 Lamp 4567
Data2 Lens 6473
# end data a3

Output in newtext.txt:

# start data a2   
Data2 Book 1234  
Data2 Box 54635  
Data2 Card 4567  
Data2 Lens 6473  
# start data a3
Data2 Book 1234
Data2 Box 54635
Data2 Lamp 4567
Data2 Lens 6473
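
The same buffering idea can also keep the '# end data' line if the marker is pushed onto the buffer before the check. The following is only a minimal sketch under a few assumptions: the text is already held in a string and read through an in-memory filehandle, and keep_clean_blocks is a hypothetical helper name, not something from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text of every block that mentions
# neither Pen nor Laptop, with the end-marker line included.
sub keep_clean_blocks {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot read in-memory text: $!";

    my $kept = '';
    my @block;
    while (my $line = <$in>) {
        push @block, $line;                 # buffer every line, end marker too
        next unless $line =~ /end data/;    # keep buffering until the block closes

        # Block complete: keep it only if no line mentions Pen or Laptop.
        $kept .= join('', @block) unless grep { /Pen|Laptop/i } @block;
        @block = ();
    }
    close $in;
    return $kept;
}

my $sample = <<'END';
# start data a1
Data1 Book 1234
Data1 Pen 54635
# end data a1
# start data a2
Data2 Book 1234
Data2 Box 54635
# end data a2
END

print keep_clean_blocks($sample);
```

With this sample, only the a2 block is printed, including its '# end data a2' line.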

Upvotes: 0

Chris Charley

Reputation: 6633

The range operator can't be used as an argument to split; split requires a /PATTERN/ as its first argument.
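
In scalar context the range operator acts as a flip-flop, so it can select the lines between two markers when it is used as a condition rather than inside split. This is a minimal sketch, assuming the text is read from an in-memory filehandle; lines_between_markers is a hypothetical helper name, and the two patterns are copied from the question.

```perl
use strict;
use warnings;

my $start = qr/^#\sstart\sdata/;
my $end   = qr/^#\send\sdata/;

# Hypothetical helper: returns the lines from each start marker through
# the matching end marker, using the flip-flop form of the range operator.
sub lines_between_markers {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot read in-memory text: $!";

    my @selected;
    while (my $line = <$in>) {
        # True from a line matching $start through the next line matching $end.
        push @selected, $line if $line =~ $start .. $line =~ $end;
    }
    close $in;
    return @selected;
}

my $sample = "header\n# start data a1\nData1 Book 1234\n# end data a1\ntrailer\n";
print lines_between_markers($sample);
```

The flip-flop only marks which lines fall inside a block; deciding whether to keep or drop a whole block still needs the lines to be collected first.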

I can't explain the results you got from your code with the incorrect usage of split. It's really behaving weirdly!

A few comments on your code.

You are using strict and warnings, which is good practice for finding errors in code under development.

You should use the preferred three-argument form of open, with a lexical filehandle ($in) rather than a bareword filehandle (filein). You should also always check that the file opened without errors, with ... or die $!.

open(filein, "<Input.txt"); is better written as: open my $in, '<', 'Input.txt' or die $!;

print fileout "$_"; the quotes around $_ are unnecessary, just print the $_ variable.
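
Put together, those comments might look like the sketch below. It is only an illustration: the temporary files created with File::Temp are an assumption made so the example is self-contained, and they stand in for Input.txt and the output file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway input file so the sketch is self-contained.
my ($tmp_fh, $in_name) = tempfile(UNLINK => 1);
print $tmp_fh "Data2 Book 1234\nData2 Box 54635\n";
close $tmp_fh or die "Cannot close $in_name: $!";

# Reserve a throwaway output file name as well.
my (undef, $out_name) = tempfile(UNLINK => 1);

# Three-argument open with lexical filehandles; every call is checked.
open my $in,  '<', $in_name  or die "Cannot open $in_name: $!";
open my $out, '>', $out_name or die "Cannot open $out_name: $!";

while (<$in>) {
    print $out $_;    # no quotes needed around $_
}

close $in;
close $out or die "Cannot close $out_name: $!";
```

The low-precedence or matters here: with ||, the check would bind to the file name instead of to the result of open.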

A working program that gets the output you want, using some Perl features, could be:

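open my $in,  '<', 'Input.txt' or die $!;
open my $out, '>', 'file2' or die $!;

{
    # $/ is compared as a literal string, not a pattern, so records are
    # split only after lines that are exactly "# end data" (no trailing
    # suffix such as a1).
    local $/ = "# end data\n";
    while (<$in>) {
        # $_ holds a whole chunk; skip it if any line mentions Pen or Laptop.
        print $out $_ unless /Pen|Laptop/;
    }
}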

The default input record separator, $/, is \n. Here I set it (local to the block) to "# end data\n".

(Creating a block isn't necessary in this case, but it should generally be done so that when the block goes out of scope, the input record separator regains its previous value, here the default \n. local applies the assigned value only within the scope of the block.)

So this program reads in chunks of lines rather than a line at a time, because the $/ separator is "# end data\n" instead of "\n".
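
Because $/ is a plain string rather than a pattern, a separator of "# end data\n" does not match marker lines that carry a suffix, such as the '# end data a1' lines in the question's sample. One possible variation, sketched here under assumptions, splits on the fixed prefix '# start data' instead and puts the marker back when a chunk is kept. It assumes the text begins with a start marker; filter_blocks is a hypothetical helper name, and an in-memory filehandle stands in for the real input file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the blocks that mention neither Pen nor Laptop.
sub filter_blocks {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot read in-memory text: $!";

    my $marker = '# start data';
    my $kept   = '';
    {
        local $/ = $marker;              # each read now ends at the next start marker
        while (my $chunk = <$in>) {
            chomp $chunk;                # chomp strips the current $/, i.e. the marker
            next unless length $chunk;   # nothing precedes the first marker
            # Restore the marker that the previous read consumed.
            $kept .= $marker . $chunk unless $chunk =~ /Pen|Laptop/;
        }
    }
    close $in;
    return $kept;
}

my $sample = "# start data a1\nData1 Pen 54635\n# end data a1\n"
           . "# start data a2\nData2 Box 54635\n# end data a2\n";
print filter_blocks($sample);
```

For this sample, only the a2 block is printed. Once the inner block ends, $/ is back to its previous value, so any later reads are line-based again.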

Upvotes: 1
