Reputation: 49567
My example string is as follows:
This is 02G05 a test string 20-Jul-2012
Now from the above string I want to extract 02G05. For that I tried the following regex with sed:
$ echo "This is 02G05 a test string 20-Jul-2012" | sed -n '/\d+G\d+/p'
But the above command prints nothing, and I believe the reason is that the pattern I supplied to sed does not match anything.
So my question is: what am I doing wrong here, and how do I correct it?
When I try the same string and pattern in Python, I get the result I expect:
>>> import re
>>> st = "This is 02G05 a test string 20-Jul-2012"
>>> re.findall(r'\d+G\d+', st)
['02G05']
Upvotes: 130
Views: 307168
Reputation: 41945
I know the question asks about sed, but since it is tagged bash, I want to point out that you don't need grep or sed:
#!/usr/bin/env bash
str="This is 02G05 a test string 20-Jul-2012"
regex="([0-9]+)G([0-9]+)"
if [[ "$str" =~ $regex ]]
then
    echo "${BASH_REMATCH[0]}"   # whole match
    echo "${BASH_REMATCH[1]}"   # first capture group
    echo "${BASH_REMATCH[2]}"   # second capture group
fi
bash has its own regex matching, and it also supports capture groups.
Result:
02G05
02
05
See this answer for more details.
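If only the whole match is needed, the same idea can be wrapped in a small function. This is a minimal sketch, assuming bash (the `[[ ... =~ ... ]]` operator is not available in plain sh); `extract_code` is a hypothetical helper name chosen for illustration.

```bash
#!/usr/bin/env bash

# extract_code is a hypothetical helper, not a bash builtin.
# It prints the first substring of $1 that looks like <digits>G<digits>
# and returns non-zero when nothing matches.
extract_code() {
    local regex='[0-9]+G[0-9]+'
    if [[ "$1" =~ $regex ]]; then
        # BASH_REMATCH[0] holds the text matched by the whole pattern.
        printf '%s\n' "${BASH_REMATCH[0]}"
    else
        return 1
    fi
}

extract_code "This is 02G05 a test string 20-Jul-2012"   # prints 02G05
```

Keeping the pattern in a variable and leaving it unquoted on the right-hand side of `=~` makes bash treat it as a regex rather than a literal string.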
Upvotes: 1
Reputation: 189387
The pattern \d might not be supported by your sed. Try [0-9] or [[:digit:]] instead.
To print only the actual match (not the entire matching line), use a substitution.
sed -n 's/.*[^0-9]\([0-9][0-9]*G[0-9][0-9]*\).*/\1/p'
The parentheses capture the text they match into a back reference. Here, the first (and only) pair of parentheses captures the string we want to keep, and we replace the entire line with just the captured string \1, then print the resulting line. The [^0-9] in front of the group stops the greedy leading .* from swallowing the first digits of the match (without it, the output would be 2G05); the trade-off is that the match must not start at the very beginning of the line. (The p flag says to print the resulting line after a successful substitution, and the -n option prevents sed from performing its normal printing of every line.)
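A leading `.*` in a sed substitution is greedy, which matters when the captured text begins with a repeatable character class such as digits. The sketch below compares two variants on the question's sample line; the variable names are illustrative only.

```sh
line="This is 02G05 a test string 20-Jul-2012"

# Leading .* is greedy: it takes as much as it can, leaving the group
# only the single digit it needs before the G.
greedy=$(printf '%s\n' "$line" |
    sed -n 's/.*\([0-9][0-9]*G[0-9][0-9]*\).*/\1/p')

# Requiring a non-digit right before the group forces the group to
# start at the first digit of the run.
guarded=$(printf '%s\n' "$line" |
    sed -n 's/.*[^0-9]\([0-9][0-9]*G[0-9][0-9]*\).*/\1/p')

echo "$greedy"    # 2G05
echo "$guarded"   # 02G05
```

The guarded form assumes at least one non-digit character precedes the match, so it would miss a match that starts in the first column.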
Upvotes: 135
Reputation: 111
We can use sed -En to simplify the regular expression, where:
-n: suppress automatic printing of the pattern space
-E: use extended regular expressions in the script
$ echo "This is 02G05 a test string 20-Jul-2012" | sed -En 's/.*([0-9][0-9]+G[0-9]+).*/\1/p'
02G05
Upvotes: 1
Reputation: 50185
How about using grep -E?
echo "This is 02G05 a test string 20-Jul-2012" | grep -Eo '[0-9]+G[0-9]+'
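With `-o`, grep prints every match on its own line, so a line containing several codes yields several output lines. A small sketch, assuming a grep that supports `-o` (GNU and BSD grep do; it is not required by POSIX); the sample text is made up for illustration.

```sh
# Made-up input containing two codes on one line.
text="codes 02G05 and 11G7 seen on 20-Jul-2012"

# Every match, one per line.
all=$(printf '%s\n' "$text" | grep -Eo '[0-9]+G[0-9]+')

# Only the first match.
first=$(printf '%s\n' "$text" | grep -Eo '[0-9]+G[0-9]+' | head -n 1)

printf '%s\n' "$all"     # 02G05 and 11G7 on separate lines
printf '%s\n' "$first"   # 02G05
```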
Upvotes: 139
Reputation: 19
Try using rextract. It will let you extract text using a regular expression and reformat it.
Example:
$ echo "This is 02G05 a test string 20-Jul-2012" | ./rextract '([\d]+G[\d]+)' '${1}'
2G05
Upvotes: 0
Reputation: 51603
Try this instead:
echo "This is 02G05 a test string 20-Jul-2012" | sed 's/.* \([0-9]\+G[0-9]\+\) .*/\1/'
But note: if there are two matches on one line, it prints the second (last) one.
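Both behaviours of this space-delimited form can be checked quickly. The sketch below uses `[0-9][0-9]*` in place of `[0-9]\+` so that it does not rely on the GNU `\+` extension; `extract_last` is a hypothetical name.

```sh
# extract_last is a hypothetical helper reading lines from stdin.
# Without -n and the p flag, lines that do not match pass through unchanged.
extract_last() {
    sed 's/.* \([0-9][0-9]*G[0-9][0-9]*\) .*/\1/'
}

printf '%s\n' "a 02G05 b 11G22 c" | extract_last   # 11G22
printf '%s\n' "nothing to see"    | extract_last   # nothing to see
```

Because the code must be surrounded by spaces here, a match at the very start or end of the line is not extracted either.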
Upvotes: 8
Reputation: 360085
sed doesn't recognize \d; use [[:digit:]] instead. You will also need to escape the + or use the -r switch (-E on OS X).
Note that [0-9] works as well for Hindu-Arabic numerals.
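Applied to the command from the question, which prints the whole matching line, those fixes could look like the sketch below. It assumes a sed that accepts `-E` (current GNU and BSD versions do) and shows the POSIX interval `\{1,\}` as an alternative to the escaped `\+`, which is a GNU extension.

```sh
line="This is 02G05 a test string 20-Jul-2012"

# Extended regular expressions: + needs no backslash.
ere=$(printf '%s\n' "$line" | sed -nE '/[[:digit:]]+G[[:digit:]]+/p')

# Basic regular expressions: \{1,\} means "one or more".
bre=$(printf '%s\n' "$line" | sed -n '/[[:digit:]]\{1,\}G[[:digit:]]\{1,\}/p')

printf '%s\n' "$ere"   # the full input line
printf '%s\n' "$bre"   # the full input line
```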
Upvotes: 6