Reputation: 3032
I am trying to use the following regular expression with the Linux grep command:
(^\s*\*\s*\[ \][^\*]+?(\w*\:[^\*]+\d$)|([^\*]+[.]com[.]au$))
When I try it out at https://www.regextester.com with the contents of a file, I get the required result, i.e., the required fields are matched, but when I use it as
grep '(^\s*\*\s*\[ \][^\*]+?(\w*\:[^\*]+\d$)|([^\*]+[.]com[.]au$))' file1
it prints nothing at all!
What's the problem here?
Upvotes: 7
Views: 956
Reputation: 2116
grep(1) uses POSIX Basic Regular Expressions (BREs) by default, and POSIX Extended Regular Expressions (EREs) when used with the -E option.
In POSIX regular expressions, escaping a non-special character (e.g. \s) has undefined behaviour, and there is no syntax for non-greedy matching (e.g. +?). Furthermore, in BREs the + and | operators are not available, and parentheses must be escaped to perform grouping.
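A minimal sketch of the difference between the two syntaxes, assuming a made-up four-line input (the sample text and variable names are illustrative only). It counts matching lines with grep -c, once with the default BRE syntax and once with -E:

```shell
# Hypothetical sample input, one entry per line.
sample='aaa
a+
b|c
c'

# BRE (default): '+' is an ordinary character, so only the literal text "a+" matches.
bre_plus=$(printf '%s\n' "$sample" | grep -c 'a+')

# ERE (-E): '+' means "one or more", so every line containing an "a" matches.
ere_plus=$(printf '%s\n' "$sample" | grep -E -c 'a+')

# BRE: '|' is an ordinary character, so only the literal text "b|c" matches.
bre_alt=$(printf '%s\n' "$sample" | grep -c 'b|c')

# ERE: '|' is alternation, so any line containing "b" or "c" matches.
ere_alt=$(printf '%s\n' "$sample" | grep -E -c 'b|c')

printf 'BRE a+: %s, ERE a+: %s, BRE b|c: %s, ERE b|c: %s\n' \
  "$bre_plus" "$ere_plus" "$bre_alt" "$ere_alt"
```

This is why the pattern from the question, passed to plain grep, looks for literal +, |, ( and ) characters and finds nothing.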
The POSIX character classes [[:space:]] and [[:alnum:]_] are portable alternatives to \s and \w respectively.
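As a small illustration of these classes, assuming two invented input lines (max_size and max-size are hypothetical labels, not from the question's file):

```shell
# Hypothetical input line: a leading tab, a word containing an underscore, then a colon.
line=$(printf '\tmax_size: 10')

# [[:space:]] stands in for \s and [[:alnum:]_] stands in for \w.
# grep -c prints the number of matching lines.
word_hits=$(printf '%s\n' "$line" | grep -c '^[[:space:]]*[[:alnum:]_]*:')

# A hyphen is not in [[:alnum:]_], so this line does not match.
# '|| true' keeps grep's non-zero "no match" status from aborting a 'set -e' script.
hyphen_hits=$(printf '%s\n' 'max-size: 10' | grep -c '^[[:space:]]*[[:alnum:]_]*:' || true)

printf 'underscore label: %s, hyphen label: %s\n' "$word_hits" "$hyphen_hits"
```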
Excluding the next matching character from a repetition can be used to emulate non-greedy matching, e.g. [^*]+?\w*: can be rewritten as [^*[:alnum:]_:]+[[:alnum:]_]*: when a single word precedes the colon.
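The effect is easiest to see with a substitution rather than with grep, since grep only reports whether a line matches. A rough sketch using sed, with an invented input string:

```shell
# Hypothetical input containing two colons.
input='key: a: b'

# Greedy: '.*:' runs to the LAST colon on the line.
greedy=$(printf '%s\n' "$input" | sed 's/^.*:/X/')

# Emulated non-greedy: '[^:]*' cannot cross a colon, so the match stops at the FIRST one.
bounded=$(printf '%s\n' "$input" | sed 's/^[^:]*:/X/')

printf 'greedy:  %s\nbounded: %s\n' "$greedy" "$bounded"
```

The greedy form leaves "X b", while the bracket expression that excludes the colon leaves "X a: b", which is what a non-greedy .*?: would produce in a Perl-style engine.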
The given regular expression can be represented as multiple BREs:
grep -e '^[[:space:]]*\*[[:space:]]*\[ \][^*[:alnum:]_:]\{1,\}[[:alnum:]_]*:[^*]\{1,\}[[:digit:]]$' \
-e '[^*]\{1,\}\.com\.au$' file1
or an ERE:
grep -E '^[[:space:]]*\*[[:space:]]*\[ \][^*[:alnum:]_:]+[[:alnum:]_]*:[^*]+[[:digit:]]$|[^*]+\.com\.au$' \
file1
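To try the ERE without the original file, here is a sketch that feeds it four invented lines on standard input (the question does not show the contents of file1, so this sample data is purely an assumption):

```shell
# Hypothetical sample lines; only the first two are expected to match.
matches=$(printf '%s\n' \
  ' * [ ] port: 8080' \
  'www.example.com.au' \
  ' * [x] port: 8080' \
  'plain text' |
  grep -E '^[[:space:]]*\*[[:space:]]*\[ \][^*[:alnum:]_:]+[[:alnum:]_]*:[^*]+[[:digit:]]$|[^*]+\.com\.au$')

# Print the lines that grep kept.
printf '%s\n' "$matches"
```

The unchecked "[ ]" item ending in a digit matches the first alternative and the domain line matches the second; the "[x]" item and the plain line are dropped.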
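Note that the GNU implementation of grep(1) accepts the short character classes \s and \w as non-portable extensions. Non-greedy repetition (+?) is not part of its BRE or ERE syntax; it is available only through the Perl-compatible -P option, which is also non-portable.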
Upvotes: 0
Reputation: 3032
pcregrep -M '(^\s*\*\s*\[ \][^\*]+?(\w*\:[^\*]+\d$)|([^\*]+[.]com[.]au$))' did the trick :)
Upvotes: 2
Reputation: 28049
I don't think grep understands character classes like \w and \s. Try using either grep -E or egrep. (grep -E is equivalent to egrep; egrep is just shorter to type.)
So your command would be:
egrep '(^\s*\*\s*\[ \][^\*]+?(\w*\:[^\*]+\d$)|([^\*]+[.]com[.]au$))' file1
Upvotes: 3