Reputation: 1939
I am trying to grep text from a log file in bash on Linux. The text is enclosed in square brackets.
For example, in:
32432423 jkhkjh [234] hkjh32 2342342
I am searching for 234.
Usually this pattern should find it:
\[(.*?)\]
but it does not work with:
| grep \[(.*?)\]
What is the correct way to do this regular expression search with grep?
Upvotes: 6
Views: 13253
Reputation: 626690
To grep all values between square brackets, including the brackets, you may use a POSIX BRE-based grep command like
grep -o '\[[^][]*]' file
...and BONUS solutions of the same kind:
grep -o '<[^<>]*>' file # Extracting all strings between angle brackets
grep -o '([^()]*)' file # Extracting all strings between parentheses
grep -o '{[^{}]*}' file # Extracting all strings between curly braces
grep -o '"[^"]*"' file # Extracting all strings between double quotes
grep -o "'[^']*'" file # Extracting all strings between single quotes
See the online grep demo. The -o option makes grep output matched substrings only, not whole lines, and the \[[^][]*] pattern matches a [, then 0 or more occurrences of any chars but [ and ] (see the negated [^][]* bracket expression), and then a ].
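As a quick check of the bracket expression described above, here is a small sketch run against a made-up line containing two bracketed values. The function name extract_bracketed and the sample input are illustrative assumptions, not part of the original answer.

```shell
# Hypothetical helper: print every [...] group found in the given string, one per line.
extract_bracketed() {
  # \[      a literal opening bracket
  # [^][]*  zero or more chars that are neither ] nor [
  # ]       a literal closing bracket
  printf '%s\n' "$1" | grep -o '\[[^][]*]'
}

# Sample input with two bracketed values (made up for illustration).
extract_bracketed '32432423 jkhkjh [234] hkjh32 [abc] 2342342'
```

Each match is printed on its own line because of -o, so a line with several bracketed values yields several output lines.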
If you need to get the value inside square brackets, excluding the brackets themselves, you can use a PCRE-based grep command like
grep -oP '\[\K[^][]*(?=])' file
The \[\K[^][]*(?=]) pattern matches:
\[ - a [ char
\K - a match reset operator that discards the text matched so far from the match memory buffer
[^][]* - 0 or more chars other than ] and [
(?=]) - a positive lookahead that requires a ] char immediately to the right of the current location.
Upvotes: 7
Reputation: 37394
I prefer \\[[^]]*] (that's: \\[ [ ^] ]* ], i.e. anything-but-right-square-brackets in square brackets) over \\[.*] because of greediness:
$ grep -o \\[.*] <<<"[this] and that too]"
[this] and that too]
vs.
$ grep -o \\[[^]]*] <<<"[this] and that too]"
[this]
Then again, grep is not the tool for everything (it was g/re/p after all). If you just want what's inside the square brackets, I'd use sed for that:
$ sed 's/.*\[\([^]]*\)].*/\1/' foo
234
ie. replace-everything-with-what's-in-parenthesis...sies.
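One caveat with the sed command above is that lines without any brackets are printed unchanged. A possible variation, sketched here under the assumption that such lines should be skipped, combines -n with the p flag of the s command. The helper name inside_brackets and the second input line are made up for illustration.

```shell
# Hypothetical helper: read lines on stdin and print the bracketed value
# of each line that has one; lines without brackets produce no output.
inside_brackets() {
  # -n suppresses automatic printing; the trailing p prints only
  # lines on which the substitution actually happened.
  sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

printf '%s\n' '32432423 jkhkjh [234] hkjh32 2342342' 'no brackets here' | inside_brackets
```

Because the leading .* is greedy, a line with several bracketed groups yields only the last one; grep -o is the better fit when every group is wanted.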
Upvotes: 1
Reputation: 530920
[ has special meaning to both the shell and grep, so you need to quote it twice. The backslashes prevent grep from treating the brackets as part of a bracket expression; quoting the entire thing prevents the shell from trying to expand the regular expression as a pattern before passing it to grep.
... | grep '\[(.*?)\]'
In your attempt, the shell stripped the backslashes after using them to treat the following characters literally, so it was approximately equivalent to ... | grep '[(.*?)]'.
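The two quoting layers can be observed directly. The following is a minimal sketch, using a made-up pattern without parentheses so that it stays valid shell syntax when unquoted; the helper count_matches and the sample strings are assumptions for illustration only.

```shell
# Unquoted: the shell removes the backslashes before the command sees the argument.
printf '%s\n' \[abc\]      # the command receives [abc]

# Single-quoted: the argument is passed through unchanged.
printf '%s\n' '\[abc\]'    # the command receives \[abc\]

# Hypothetical helper: count how many matches grep -o reports for a pattern.
count_matches() {
  printf '%s\n' "$2" | grep -o "$1" | wc -l
}

# Without the backslash, grep sees a bracket expression matching a, b or c.
count_matches '[abc]' 'x [abc] y'

# With the opening bracket escaped, grep matches the literal text [abc].
count_matches '\[abc]' 'x [abc] y'
```

The first count covers three single-character matches and the second a single literal match, which is the difference the answer describes.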
Upvotes: 0
Reputation: 289505
You can look for an opening bracket and discard it from the match with the \K escape sequence. Then, match up to the closing bracket:
$ grep -Po '\[\K[^]]*' <<< "32432423 jkhkjh [234] hkjh32 2342342"
234
Note that you can omit the -P option (Perl-compatible regular expressions) by saying:
$ grep -o '\[.*]' <<< "32432423 jkhkjh [234] hkjh32 2342342"
[234]
However, as you can see, this also prints the brackets. That's why it is useful to have -P, which supports look-behind and look-ahead.
You also mention ? in your regexp. As you may already know, *? makes a regex match behave in a non-greedy way. Let's see an example:
$ grep -Po '\[.*?]' <<< "32432423 jkhkjh [23]4] hkjh32 2342342"
[23]
$ grep -Po '\[.*]' <<< "32432423 jkhkjh [23]4] hkjh32 2342342"
[23]4]
With .*?, in [23]4] it matches [23]. With just .*, it matches up to the last ], hence getting [23]4]. This behaviour only works with the -P option.
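When -P is not available, a similar shortest-match result can often be obtained with a negated bracket expression instead of a lazy quantifier. This is a sketch of that swapped-in technique, not the non-greedy operator itself; the helper name shortest_group is hypothetical.

```shell
# Hypothetical helper: print the shortest [...] matches without -P.
shortest_group() {
  # [^]]* cannot cross a closing bracket, so the match has to stop
  # at the first ] after the opening bracket.
  printf '%s\n' "$1" | grep -o '\[[^]]*]'
}

shortest_group '32432423 jkhkjh [23]4] hkjh32 2342342'
```

On the sample line this prints [23], the same result as the -P version with .*? shown above.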
Upvotes: 9