Blurman

Reputation: 609

perl extract string after specific pattern

I want to extract, using Perl, xxx (the string after Block:) and prod (the string after Milestone:). Neither the strings after Block: and Milestone: nor the amount of whitespace around them is fixed. So far I have only been able to grep the full line, using the code below:

use strict;
use warnings;

my $file = 'xxx.txt';
open my $fh, '<', $file or die "Could not open '$file' $!\n";
while (my $line = <$fh>){
    chomp $line;
    # my @stage_status = $line =~ /(\:.*)\s*$/;
    my @stage_status = $line =~ /\b(Block)(\W+)(\w+)/;
    foreach my $stage_statuss (@stage_status){
        print "$stage_statuss\n";
    }
}

Example of line in a file:

| Block:                   | xxx | Milestone:           | prod        |

Upvotes: 1

Views: 209

Answers (2)

anubhava

Reputation: 784958

Using GNU grep (the -P option requires a build with PCRE support), you can do:

grep -oP '\b(Block|Milestone)\W+\K\w+' file

xxx
prod

RegEx Details:

  • \b: Word boundary
  • (Block|Milestone): Match Block or Milestone
  • \W+: Match 1+ non-word characters
  • \K: Reset the start of the reported match, discarding everything matched so far
  • \w+: Match 1+ word characters
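
The same pattern also works inside Perl itself (5.10 or later, where \K is available). This is a minimal sketch, not part of the original answer; extract_values is a hypothetical helper name and the sample line is the one from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return every word that follows "Block" or "Milestone".
sub extract_values {
    my ($line) = @_;
    # \K drops the label and the separators from the reported match, so with
    # /g in list context (and no capture groups) each element is just the word.
    return $line =~ /\b(?:Block|Milestone)\W+\K\w+/g;
}

my $sample = '| Block:                   | xxx | Milestone:           | prod        |';
print "$_\n" for extract_values($sample);
```

For the sample line this should print xxx and prod on separate lines, matching the grep output above.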

Update:

Suggested Perl code, as per the OP's edited question:

use strict;
use warnings;

my $file = 'xxx.txt';
open my $fh, '<', $file or die "Could not open '$file' $!\n";

while (my $line = <$fh>){
    chomp $line;
    print "checking: $line\n";
    my @stage_status = $line =~ /\b(?:Block|Milestone)\W+(\w+)/g;
    
    foreach my $stage_statuss (@stage_status){
       print "$stage_statuss\n";
    }
}

Output:

checking: | Block:                   | xxx | Milestone:           | prod        |
xxx
prod
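
If it matters which value belongs to which label, one possible variation is to capture the label as well and assign the result to a hash. This is only a sketch under that assumption; parse_fields is a hypothetical helper name, not something from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: map each label ("Block", "Milestone") to its value.
sub parse_fields {
    my ($line) = @_;
    # Two capture groups with /g in list context yield a flat
    # (label, value, label, value, ...) list, which assigns directly to a hash.
    my %field = $line =~ /\b(Block|Milestone)\W+(\w+)/g;
    return \%field;
}

my $row = '| Block:                   | xxx | Milestone:           | prod        |';
my $field = parse_fields($row);
for my $label (sort keys %$field) {
    print "$label => $field->{$label}\n";
}
```

The values can then be looked up by name, for example $field->{Block}, instead of relying on their position in the list.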

Upvotes: 1

RavinderSingh13

Reputation: 133428

You could do this with a simple awk. With an appropriate field separator we can pick out the needed values: set the field separator to either a pipe followed by whitespace or a run of whitespace, then in the main program print the 4th field if the 2nd field is Block: and the 8th field if the 6th field is Milestone:.

awk -F'\\|[[:space:]]+|[[:space:]]+' '$2=="Block:"{print $4} $6=="Milestone:"{print $8}' Input_file


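2nd solution: Almost the same as my 1st solution above; the only difference is that each pipe, together with any whitespace around it, is treated as a single field separator, so the values land in the 3rd and 5th fields.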

awk -F'([[:space:]]+)?\\|([[:space:]]+|$)' '$2=="Block:"{print $3} $4=="Milestone:"{print $5}' Input_file
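
Since the question asks for Perl, the field-splitting idea from the 2nd solution can be sketched there too with split. This is an illustration only, assuming the same row layout as the sample line; split_row is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: split a row on pipes plus any surrounding whitespace.
sub split_row {
    my ($line) = @_;
    # The leading pipe produces an empty first field; split drops
    # trailing empty fields, so the closing pipe adds nothing.
    return split /\s*\|\s*/, $line;
}

my $record = '| Block:                   | xxx | Milestone:           | prod        |';
my @cols = split_row($record);

# Same checks as the awk version, with Perl's 0-based indexes:
# awk's $2/$3 are $cols[1]/$cols[2], and $4/$5 are $cols[3]/$cols[4].
print "$cols[2]\n" if $cols[1] eq 'Block:';
print "$cols[4]\n" if $cols[3] eq 'Milestone:';
```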

Upvotes: 0
