user1495523

Reputation: 505

perl split file with selective data

I am trying to split a very large file into smaller files based on a string within the file. Along with that, I would like to filter out the elements I don't need by keeping only the elements in a list of desired ones.

Example input:

 Block(A_1){
   Block_area : 2.6112;
   Block_footprint : 3BAA5927A22E66B0AE1214A806440F12;
   Block_Coordinates {
    values ("0 , 0",\
        "50, 50");
    }
   Block_connection : "North";
 }
 Block(BX_q_2_1){
   Block_area : 2.6112;
   Block_footprint : 3BAA5927A22E66B0AE1214A806440F12;
   Block_Coordinates {
    values ("20 , 20",\
        "20, 70");
   Block_connection : "South";
 }
 Block(C_2_r){
   Block_area : 2.6112;
   Block_footprint : 3BAA5927A22E66B0AE1214A806440F12;
   Block_Coordinates {
    values ("50 , 50",\
        "10, 500");
   Block_connection : "North-West";
 }

The output should be three files containing only the Block_area and Block_Coordinates entries. The real input has a lot of other data, so I would like to select the entries with a regex.

A_1.txt

 Block(A_1){
   Block_area : 2.6112;
   Block_Coordinates {
    values ("0 , 0",\
        "50, 50");
    }
 }

BX_q_2_1.txt

 Block(BX_q_2_1){
   Block_area : 2.6112;
   Block_Coordinates {
    values ("20 , 20",\
        "20, 70");
 }

C_2_r.txt

 Block(C_2_r){
   Block_area : 2.6112;
   Block_Coordinates {
    values ("50 , 50",\
        "10, 500");
 }

I was earlier helped to split the file with:

while (<>) {
  my ($file) = m|\( (.+?) \)|x or next; 
  open my $fh, ">", "$file.txt";
  print $fh $_;
  close $fh;
}

Alternatively:

while (<$in_fh>) {
  open $out_fh, '>', "$1.txt" if / Block \( (\w+) \) /x;
  print $out_fh $_ if $out_fh;
}

But I am not able to include selective data.

regards

Upvotes: 1

Views: 80

Answers (2)

choroba

Reputation: 241828

To only output specific keywords, I'd use the following program:

#!/usr/bin/perl
use warnings;
use strict;

my $OUT;
while (<>) {
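    if (my ($filename) = /Block \( (.*?) \) \{/x) {       # literal brace escaped; some newer Perls warn about or reject an unescaped one
        open $OUT, '>', "$filename.txt" or die $!;
    }
    next unless $OUT;                                     # nothing to write to before the first block

    print {$OUT} $_ if ! /Block_/                         # header & inner values
                    or /Block_(?: area | Coordinates )/x; # keywords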

}

It doesn't work if you need to skip multiline entries, though.
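One way to cope with multiline entries is to track the brace depth instead of deciding line by line. The following is only a minimal sketch. It assumes that braces are balanced and never appear inside quoted values (the second and third sample blocks as posted are each missing a closing brace, so they would need fixing first). The name filter_blocks and its callback interface are made up for this illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: reads blocks from $in and calls $emit->($name, $line)
# for every line that should end up in "$name.txt".
# $keep is a hash ref of wanted keywords (the part after "Block_").
sub filter_blocks {
    my ($in, $keep, $emit) = @_;
    my $name;           # name of the current block, undef outside a block
    my $depth   = 0;    # brace nesting depth inside the current block
    my $keeping = 1;    # true while inside an entry we want to keep

    while (my $line = <$in>) {
        if ($line =~ /Block \( (.*?) \) \s* \{/x) {       # block header
            ($name, $depth, $keeping) = ($1, 1, 1);
            $emit->($name, $line);
            next;
        }
        next unless defined $name;                        # text outside any block

        # A keyword directly inside the block starts a new entry.
        if ($depth == 1 and $line =~ /^ \s* Block_ (\w+)/x) {
            $keeping = $keep->{$1};
        }

        # Count braces to find where nested entries and the block end.
        my $opens  = () = $line =~ /\{/g;
        my $closes = () = $line =~ /\}/g;
        $depth += $opens - $closes;

        $keeping = 1 if $depth <= 0;                      # the block's own closing brace
        $emit->($name, $line) if $keeping;
        undef $name if $depth <= 0;
    }
}

# Usage: perl split_blocks.pl input_file
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "$ARGV[0]: $!";
    my ($OUT, $current);
    filter_blocks($in, { area => 1, Coordinates => 1 }, sub {
        my ($name, $line) = @_;
        if (!defined $current or $current ne $name) {     # first line of a new block
            open $OUT, '>', "$name.txt" or die "$name.txt: $!";
            $current = $name;
        }
        print {$OUT} $line;
    });
}
```

Because an entry is kept or skipped as a whole, continuation lines and nested braces of an unwanted entry are dropped together with its first line. The input is still read line by line, so the size of the file should not matter.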

Upvotes: 1

vks

Reputation: 67968

If you are willing to use a match with capture groups:

(Block\([^)]*\){(?:(?!\bBlock_connection).)*)

Try this. It will give all the required groups. Set the flags s and g. See the demo.

http://regex101.com/r/rQ6mK9/41

Alternatively, you can split on the following pattern:

Block_connection\s+:\s+"[^"]+";\s+}

See demo.

http://regex101.com/r/rQ6mK9/43
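In Perl, the first pattern could be applied roughly as follows. This is only a sketch: it assumes the whole input fits in memory as one string, and blocks_before_connection is a hypothetical helper name. A second capture group is added to pick up the block name. The literal brace is written as \{, which is the safe choice because some newer Perl versions warn about or reject an unescaped one.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: returns a hash ref { block_name => block_text },
# where block_text runs from "Block(name){" up to, but not including,
# the first Block_connection that follows it.
sub blocks_before_connection {
    my ($text) = @_;
    my %block;
    # /s lets "." match newlines, /g walks through all blocks.
    while ($text =~ /(Block\(([^)]*)\)\{(?:(?!\bBlock_connection).)*)/sg) {
        $block{$2} = $1;
    }
    return \%block;
}

# Usage: perl by_regex.pl input_file
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "$ARGV[0]: $!";
    my $text = do { local $/; <$in> };        # slurp the whole file
    my $blocks = blocks_before_connection($text);
    for my $name (keys %$blocks) {
        open my $OUT, '>', "$name.txt" or die "$name.txt: $!";
        print {$OUT} $blocks->{$name};
    }
}
```

Note that the pattern only cuts each block off at Block_connection. Other entries such as Block_footprint stay in the match, and the closing brace of the block is not part of it.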

Upvotes: 0
