Jill448

Reputation: 1793

Perl grep a multi line output for a pattern

I have the code below, where I am trying to grep for a pattern in a variable that holds multiline text.

The multiline text in $output looks like this:

 _skv_version=1
 COMPONENTSEQUENCE=C1-

 BEGIN_C1
        COMPONENT=SecurityJNI
TOOLSEQUENCE=T1-
END_C1
CMD_ID=null
CMD_USES_ASSET_ENV=null_jdk1.7.0_80
CMD_USES_ASSET_ENV=null_ivy,null_jdk1.7.3_80
BEGIN_C1_T1
 CMD_ID=msdotnet_VS2013_x64
CMD_ID=ant_1.7.1
CMD_FILE=path/to/abcI.vc12.sln
BEGIN_CMD_OPTIONS_RELEASE
    -useideenv

The code I am using to grep for the pattern:

use strict; 
use warnings; 

my $cmd_pattern = "CMD_ID=|CMD_USES_ASSET_ENV="; 
my @matching_lines; 
my $output = `cmd to get output` ; 
print "output is : $output\n"; 

if ($output =~ /^$cmd_pattern(?:null_)?(\w+([\.]?\w+)*)/s ) { 
         print "1 is : $1\n"; 
          push (@matching_lines, $1); 
  } 

$output contains the multiline text as expected, but the regex match I run against it does not give me any results.

Desired output

jdk1.7.0_80
ivy
jdk1.7.3_80
msdotnet_VS2013_x64
ant_1.7.1

Upvotes: 0

Views: 1395

Answers (1)

José Castro

Reputation: 671

Regarding your regular expression:

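  • You need a while, not an if (otherwise you'll only be matching once); when you make this change you'll also need the /g modifier, so that each iteration resumes where the previous match ended (/c is optional here: it only keeps the match position from being reset when a match fails)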
  • You don't really need the /s modifier, as that one makes . match \n, which you're not making use of (see note at the end)
  • You want to use the /m modifier so that ^ matches the beginning of every new line, and not just the beginning of the string
  • You want to add \s* to your regular expression right after ^, because in at least one of your lines you have a leading space
  • You need parentheses around $cmd_pattern (a non-capturing (?:...) group, so that $1 still refers to the value); otherwise, you're getting two options, the first one being ^CMD_ID= and the second one being CMD_USES_ASSET_ENV= followed by the rest of your expression

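To make the grouping point concrete, here is a minimal sketch using two lines borrowed from your sample output. The character class [\w.]+ is my own shorthand for the value part, not something from your code. Without the parentheses, the first branch is only ^CMD_ID= and contains no capture group, so the match can succeed while $1 stays undefined:

```perl
use strict;
use warnings;

# Two lines borrowed from the question's sample output.
my $output      = "CMD_ID=ant_1.7.1\nCMD_USES_ASSET_ENV=null_jdk1.7.0_80\n";
my $cmd_pattern = "CMD_ID=|CMD_USES_ASSET_ENV=";

# Ungrouped: the alternation splits the whole regex in two, so the first
# branch is just ^CMD_ID= and never reaches the capture group.
my $ungrouped = '(no match)';
if ($output =~ /^$cmd_pattern(?:null_)?([\w.]+)/) {
    $ungrouped = defined $1 ? $1 : '(undef)';
}
print "ungrouped: $ungrouped\n";

# Grouped, with /m so ^ anchors at every line and /g to collect all matches.
my @grouped = $output =~ /^\s*(?:$cmd_pattern)(?:null_)?([\w.]+)/gm;
print "grouped: @grouped\n";
```

With the group in place, both prefixes share the same tail, so every matching line contributes a value.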
You can also simplify the (\w+([\.]?\w+)*) bit down to (.+); without /s, the . stops at the newline, so this captures the rest of the line.

The result would be:

while ($output =~ /^\s*(?:$cmd_pattern)(?:null_)?(.+)/gcm ) {            
  print "1 is : $1\n";              
  push (@matching_lines, $1); 
}

That being said, your regular expression still won't split ivy and jdk1.7.3_80 on its own; I would suggest adding a split and removing the null_ prefix with something like:

while ($output =~ /^\s*(?:$cmd_pattern)(?:null_)?(.+)/gcm ) {           
  my $text = $1;
  my @text;
  if ($text =~ /,/) {
    @text = split /,(?:null_)?/, $text;
  }
  else {
    @text = $text;
  }

  for (@text) {
    print "1 is : $_\n";
    push (@matching_lines, $_);
  }
}   

The only problem you're left with is the lone line CMD_ID=null. I'm gonna leave that to you :-)
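If you want a starting point, here is one possible sketch. It assumes that a bare null means "no value" and can simply be skipped, which is my guess at your intent. It also relies on split returning the whole string as a single element when there is no comma, so the if/else is not needed, and it strips the null_ prefix per item instead of in the regex:

```perl
use strict;
use warnings;

# Lines borrowed from the question's sample output.
my $output = <<'END';
CMD_ID=null
CMD_USES_ASSET_ENV=null_ivy,null_jdk1.7.3_80
 CMD_ID=msdotnet_VS2013_x64
END

my $cmd_pattern = "CMD_ID=|CMD_USES_ASSET_ENV=";
my @matching_lines;

while ($output =~ /^\s*(?:$cmd_pattern)(.+)/gm) {
    for my $item (split /,/, $1) {
        $item =~ s/^null_//;        # drop the optional null_ prefix
        next if $item eq 'null';    # assumption: a bare "null" carries no value
        push @matching_lines, $item;
    }
}

print "$_\n" for @matching_lines;
```

Whether skipping is right depends on what CMD_ID=null means in your tool's output; if it is meaningful, drop the next line.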

(I recently wrote a blog post on best practices for regular expressions - http://blog.codacy.com/2016/03/30/best-practices-for-regular-expressions/ - you'll find there a note to always require the /s in Perl; the reason I mention here that you don't need it is that you're not using the ones you actually need, and that might mean you weren't certain of the meaning of /s)

Upvotes: 1
