Noob_Number_1

Reputation: 755

What is the regular expression syntax in bash?

I created a regex that finally works for my case:

:pkcs7-data\n.+\n\s+(.+?):

You can see how it works at this REGEX101 link. It has to find the first occurrence of a certain significant number.

I built it using REGEX101, but I have to use it in a bash terminal. My idea is to use that regex in a grep command that also takes a file as input:

grep -Po ':pkcs7-data\n.+\n\s+(.+?):' file.txt

My problem is that the REGEX101 syntax I used doesn't work in this bash:

bash --version
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

I looked up a tool (tool1) and some files (file1, file2, file3), but I'm still not able to get anything: every time I execute grep, it returns nothing. I think the problem must be in some symbols like "\n" or "+", but I'm not succeeding. If I execute something like

grep -Po ':pkcs7-data' file.txt

I get good results. The problems begin once I add symbols like the end-of-line.

Upvotes: 1

Views: 154

Answers (3)

Juan Diego Godoy Robles

Reputation: 14955

An awk solution:

awk  'BEGIN{FS=" +|:"}/:pkcs7-data/{getline;getline;print $2;exit }' file.txt

pcregrep (if available) is a nice tool for handling multiline regexes, but I can't find a way to get only the matched group:

pcregrep -M -o '(?<=:pkcs7-data)\n.+\n\s+(\d+)' file.txt
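One possible way to print only the captured group, as a sketch: pcregrep documents a numbered form of -o (-o1 prints just the first capturing group), though older builds may not have it. The sample file contents below are hypothetical, made up only to exercise the pattern, and the script does nothing if pcregrep is not installed.

```shell
# Hypothetical sample input (an assumption, not the asker's real file).
sample=$(mktemp)
cat > "$sample" <<'EOF'
header :pkcs7-data
  skipped line
   65:first payload
other :pkcs7-data
  skipped line
  415:second payload
EOF

if command -v pcregrep >/dev/null 2>&1; then
    # -M allows matches to span lines; -o1 prints only capture group 1.
    # head keeps just the first occurrence.
    group=$(pcregrep -M -o1 ':pkcs7-data\n.+\n\s+(\d+)' "$sample" | head -n 1)
    echo "$group"
else
    echo "pcregrep is not installed; nothing to show" >&2
fi
rm -f "$sample"
```

With -o1 the lookbehind is no longer needed, since the text before the group is matched but not printed.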

Upvotes: 1

Noob_Number_1

Reputation: 755

Thanks to @Rob and @klashxx I found a solution. As @Rob said:

"Grep is a line-based regular expression tool; it does not handle multi-line patterns like what you have. You should be using Perl or rework your problem into sed or awk."

So grep had to be ruled out. After that, @klashxx added:

An awk solution:

awk 'BEGIN{FS=" +|:"}/:pkcs7-data/{getline;getline;print $2}' file.txt

pcregrep (if available) is a nice tool for handling multiline regexes, but I can't find a way to get only the matched group:

pcregrep -M -o '(?<=:pkcs7-data)\n.+\n\s+(\d+)' file.txt

I tried to solve it with awk. The only problem with @klashxx's awk solution was that I wanted just the first occurrence. So I did a little research and found that awk's exit statement stops execution, so awk stops after the first occurrence:

awk 'BEGIN{FS=" +|:"}/:pkcs7-data/{getline;getline;print $2; exit;}' file.txt

And now it works. Thanks a lot for helping.
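To illustrate the effect of exit, here is a small sketch that runs both variants on a hypothetical sample file (the file contents are made up for the demonstration, not my real data):

```shell
# Hypothetical sample input with two marker lines (an assumption).
sample=$(mktemp)
cat > "$sample" <<'EOF'
header :pkcs7-data
  skipped line
   65:first payload
other :pkcs7-data
  skipped line
  415:second payload
EOF

# Without exit: one value is printed for every line matching the marker.
all=$(awk 'BEGIN { FS = " +|:" } /:pkcs7-data/ { getline; getline; print $2 }' "$sample")

# With exit: awk stops reading input after the first match.
first=$(awk 'BEGIN { FS = " +|:" } /:pkcs7-data/ { getline; getline; print $2; exit }' "$sample")

printf 'all:\n%s\nfirst: %s\n' "$all" "$first"
rm -f "$sample"
```

Because the field separator is spaces or a colon, the leading spaces on the third line leave $1 empty and put the number in $2.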

Kind regards, Andrés-J. Cremades

Upvotes: 1

Rob

Reputation: 2656

Grep is a line-based regular expression tool; it does not handle multi-line patterns like what you have. You should be using Perl or rework your problem into sed or awk.

Upvotes: 1
