PLT

Reputation: 21

How to split files using Perl?

Each div should be written to its own file.

Input.txt

[[div]]
line 1
line 2
...
[[/div]]

[[div]]
line 3
line 4
line 5
...
[[/div]]

[[div]]
line 6
line 7
...
[[/div]]

filename.txt

fm.html
chap01.html
bm.html

Output needed

fm.html

<html>
<body>
line 1
line 2
...
</body>
</html>

chap01.html

<html>
<body>
line 3
line 4
line 5
...
</body>
</html>

bm.html

<html>
<body>
line 6
line 7
...
</body>
</html>

This is the code I have tried so far, but it writes the last div to every file. I also need to add meta tags. Any help would be appreciated.

#!/usr/bin/perl
open(REDA,"filename.txt");
@namef=<REDA>;
open(RED,"input.txt");
open(WRITX,">input1.txt");
while(<RED>)
   {
    chomp($_);
    $_="$_"."<cr>";
    print WRITX $_;
   }
close(RED);
close(WRITX);
open(REDQ,"input1.txt");
open(WRITQ,">input2.txt");
while(<REDQ>)
   {
                $_=~s/\[\[div\]\]<cr>/\n\[\[div\]\]/gi;
    print WRITQ $_;
   }
close(REDQ);
close(WRITQ);
open(REDE,"input2.txt");
while(<REDE>)
   {
   foreach $namef (@namef)
    {
         chomp($namef);
         $namef=~s/\.[a-z]+//gi;
        open(WRIT1,">$namef.html");
            if(/\[\[div\]\]/i)
            {
                chomp($_);
                $_=~s/<cr>/\n/gi;
                print WRIT1 $_;
            }
         }
    }
close(REDA);
close(REDE);
close(REDX);
close(WRIT1);
system ("del input1.txt");
system ("del input2.txt");

Upvotes: 0

Views: 1460

Answers (4)

Dave Cross

Reputation: 69244

Writing it in rather more idiomatic Perl, you might get something like this:

#!/usr/bin/perl

use strict;
use warnings;

# First argument is the name of the file that contains
# the filenames.
open my $fn, shift or die $!;
chomp(my @files = <$fn>);

# Variable to contain the current open filehandle
my $curr_fh;
while (<>) {
  # Skip blank lines
  next unless /\S/;

  # If it's the opening of a div...
  if (/\[\[div]]/) {
    # Open the next file...
    open $curr_fh, '>', shift @files or die $!;
    # Print the opening html...
    print $curr_fh "<html>\n<body>\n";
    # ... and skip the rest of the loop
    next;
  }

  # If it's the end of a div
  if (/\[\[\/div]]/) {
    # Print the closing html...
    print $curr_fh "</body>\n</html>\n";
    # Close the current file...
    close $curr_fh;
    # Unset the variable so we can reuse it...
    undef $curr_fh;
    # and skip the rest of the loop
    next;
  }

  # Otherwise, just print the record to the currently open file
  print $curr_fh $_;
}

Call it with two arguments: the name of the file containing the filenames (filename.txt), followed by the name of the file containing the data (input.txt).
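The question also asks for meta tags, which the script above does not emit. Here is a minimal sketch of one way to add them. It assumes a single charset tag is wanted, since the question does not say which tags are needed, and html_header is a hypothetical helper name:

```perl
use strict;
use warnings;

# html_header is a hypothetical helper. The charset meta tag is an assumed
# example, because the question does not say which meta tags are required.
sub html_header {
    my ($charset) = @_;
    return "<html>\n<head>\n<meta charset=\"$charset\">\n</head>\n<body>\n";
}

# In the script above, this value would be printed to $curr_fh in place of
# the literal "<html>\n<body>\n" string.
print html_header('utf-8');
```

Keeping the header in one sub means further tags can be added in a single place without touching the loop.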

Upvotes: 0

terdon

Reputation: 3370

You could do something like this:

#!/usr/bin/env perl
use strict;
use warnings;

my @file_names;
## Read the list of file names
open(my $fh, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
while (<$fh>) {
    chomp; #remove new line character from the end of the line
    push @file_names,$_;
}

my $counter=0;
my ($file_name,$fn);
## Read the input file
open($fh, '<', $ARGV[1]) or die "Cannot open $ARGV[1]: $!\n";
while (<$fh>) {
    ## If this is an opening DIV, open the next output file,
    ## and set $counter to 1.
    if (/\[\[div\]\]/) {
        $counter=1;
        $file_name=shift(@file_names);
        open($fn, '>', $file_name) or die "Cannot open $file_name: $!\n";
    }
    ## If this is a closing DIV, print the line and set $counter back to 0
    if (/\[\[\/div\]\]/) {
        $counter=0;
        print $fn $_;
        close($fn);
    }
    ## Print into the corresponding file handle if $counter is 1
    print $fn $_ if $counter==1
}

Save the script as foo.pl and run it like this:

perl foo.pl filename.txt Input.txt 

Upvotes: 1

Joseph R.

Reputation: 805

If you're sure the [[div]] sections are separated by blank lines, you can use Perl's paragraph mode, which reads a file in chunks separated by one or more blank lines. The following code (tested) does what you need. Run it in a terminal whose current directory contains the relevant files:

perl -n00 -e '
    BEGIN{ #Executed before input.txt is read
        open $f,"<","filename.txt";
        @names = split /\n+/,<$f> #Split is needed because we changed the input record separator
    }

    # The following is executed for each "paragraph" (div section)
    s!\[\[div\]\]\n!<html>\n<body>\n!; # replace [[div]] with <html>\n<body>\n
    s!\[\[/div\]\]\n!</body>\n</html>!; # replace [[/div]] with </body>\n</html>
    $content{shift @names}=$_; #Add the modified content to hash keyed by file name

    END{ #This is executed after the whole of input.txt has been read
        for(keys %content){ #For each file we want to create
            open $of,">",$_;
            print $of $content{$_}
        }
    }
' input.txt

Update

If you want to use the above code as a Perl script, you can do the following:

#!/usr/bin/env perl

use strict;
use warnings;

open my $f,'<','filename.txt' or die "Failed to open filename.txt: $!\n";
my @names;
chomp(@names=<$f>);

open my $if,'<','input.txt' or die "Failed to open input.txt: $!\n";
my %content;
while(my $paragraph=do{local $/="";<$if>}){
    $paragraph=~ s!\[\[div\]\]\n!<html>\n<body>\n!;
    $paragraph=~ s!\[\[/div\]\]\n!</body>\n</html>!;
    $content{shift @names}=$paragraph;
}

for(keys %content){
    open my $of,'>',$_ or die "Failed to open $_ : $!\n";
    print $of $content{$_}
}

Save the above as (say) split_file.pl, make it executable via chmod +x split_file.pl then run it as ./split_file.pl.
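To see what paragraph mode does on its own, here is a minimal sketch that reads from an in-memory string instead of input.txt. The helper name read_paragraphs and the sample data are assumptions made for illustration only:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return every blank-line-separated chunk read from a filehandle.
# read_paragraphs is a hypothetical helper name, not part of the script above.
sub read_paragraphs {
    my ($fh) = @_;
    local $/ = "";      # paragraph mode: a record ends at one or more blank lines
    my @chunks = <$fh>;
    chomp @chunks;      # in paragraph mode, chomp strips all trailing newlines
    return @chunks;
}

# Sample data held in a string so the sketch needs no files on disk.
my $sample = "[[div]]\nline 1\n[[/div]]\n\n\n[[div]]\nline 3\n[[/div]]\n";
open my $in, '<', \$sample or die "Cannot open in-memory handle: $!\n";
my @divs = read_paragraphs($in);
close $in;

printf "%d paragraphs\n", scalar @divs;
print "--\n$_\n" for @divs;
```

Because $/ is set with local inside the sub, the rest of the program keeps reading line by line.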

Upvotes: 1

slm

Reputation: 16416

In Perl you can loop through the contents of file filename.txt like so:

#!/usr/bin/perl

# somescript.pl

open (my $fh, "<", "filename.txt");
my @files = <$fh>;
close ($fh);

foreach my $file (@files) {
    print "$file";
}

Put the above in a file called somescript.pl, make it executable with chmod +x somescript.pl, and run it:

$ ./somescript.pl 
fm.html
chap01.html
bm.html

You can see that it now reads the file filename.txt and prints each line to the screen. I leave the rest for you to try. If you get stuck, ask for help.

I would use the same approach that I did to read in the filename.txt file for reading in the input.txt file.
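As a rough sketch of that next step, the lines read from input.txt could be grouped by div before anything is written out. The helper name split_divs is hypothetical, and the in-memory string merely stands in for input.txt so the example runs without any files:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group the lines found between [[div]] and [[/div]] markers.
# split_divs is a hypothetical helper name used only for this sketch.
sub split_divs {
    my @lines = @_;
    my (@divs, $current);
    for my $line (@lines) {
        if ($line =~ /\[\[div\]\]/) {
            $current = [];                      # start collecting a new div
        }
        elsif ($line =~ m{\[\[/div\]\]}) {
            push @divs, $current if $current;   # store the finished div
            undef $current;                     # lines outside a div are skipped
        }
        elsif ($current) {
            push @$current, $line;
        }
    }
    return @divs;
}

# In-memory stand-in for input.txt; with a real file, open it by name instead.
my $data = "[[div]]\nline 1\nline 2\n[[/div]]\n\n[[div]]\nline 3\n[[/div]]\n";
open(my $fh, "<", \$data) or die "Cannot open in-memory data: $!\n";
my @lines = <$fh>;
close($fh);

my @divs = split_divs(@lines);
print "div ", $_ + 1, ": ", scalar @{ $divs[$_] }, " lines\n" for 0 .. $#divs;
```

Each element of @divs is an array reference, so the Nth div can be paired with the Nth name read from filename.txt.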

Upvotes: 0
