Reputation: 16034
I'm trying to get the text between two tokens.
For example, let's say the text is:
arn:aws:dfasdfasdf/asdfa:start:CaptureThis/end
The output should be: CaptureThis
And the two tokens are: :start:
and /end
The closest I could get was using this regex:
INPUT="arn:aws:dfasdfasdf/asdfa:start:CaptureThis/end"
VALUE=$(echo "${INPUT}" | sed -e 's/:start:\(.*\)\/end/\1/')
... but this returns most of the string: arn:aws:dfasdfasdf/asdfaCaptureThis
How do I get all of the other text out of the way?
Upvotes: 2
Views: 132
Reputation: 52291
You could use (GNU) grep with Perl regular expressions (look-arounds) and the -o
option to only return the match:
$ grep -Po '(?<=:start:).*(?=/end)' <<< 'arn:aws:dfasdfasdf/asdfa:start:CaptureThis/end'
CaptureThis
Upvotes: 3
Reputation: 84579
There is no need for any external utilities; bash parameter expansion will handle it all for you:
INPUT="arn:aws:dfasdfasdf/asdfa:start:CaptureThis/end"
token=${INPUT##*:}
echo "${token%/*}"
Output
CaptureThis
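Note that the expansion above keys on the last colon and the last slash rather than on the tokens themselves. A minimal sketch that strips on the literal tokens instead, assuming each token occurs once in the input (the variable names rest and value are illustrative, not from the original post):

```shell
INPUT="arn:aws:dfasdfasdf/asdfa:start:CaptureThis/end"
# Remove the shortest prefix that ends in ":start:"
rest=${INPUT#*:start:}
# Remove the shortest suffix that begins with "/end"
value=${rest%/end*}
echo "$value"
```

Both expansions are plain POSIX parameter expansion, so this should also work in shells other than bash.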
Upvotes: 2
Reputation: 439228
Try this:
$ sed 's/^.*:start:\(.*\)\/end.*$/\1/' <<<'arn:aws:dfasdfasdf/asdfa:start:CaptureThis/end'
CaptureThis
The problem with your approach was that you only replaced part of the input line, because your regex didn't match the entire line.
Note how the command above anchors the regex both at the beginning of the line (^.*) and at the end (.*$) so as to ensure that the entire line is matched and thus replaced.
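To see the difference side by side, here is a small sketch that runs the unanchored substitution from the question and the whole-line version on the same input (the variable names partial and full are only for illustration):

```shell
INPUT="arn:aws:dfasdfasdf/asdfa:start:CaptureThis/end"
# Partial match: only ":start:CaptureThis/end" is replaced, so the prefix survives
partial=$(printf '%s\n' "$INPUT" | sed 's/:start:\(.*\)\/end/\1/')
# Whole-line match: the entire line is replaced by the capture group
full=$(printf '%s\n' "$INPUT" | sed 's/^.*:start:\(.*\)\/end.*$/\1/')
echo "$partial"   # arn:aws:dfasdfasdf/asdfaCaptureThis
echo "$full"      # CaptureThis
```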
Upvotes: 2
Reputation: 5034
You could use:
VALUE=$(echo "${INPUT}" | sed -e 's/.*:start:\(.*\)\/end.*/\1/')
If the tokens are liable to change, you could put them in variables. Since "/end" contains a "/", which would clash with sed's default delimiter, you'd probably want to switch to a delimiter that doesn't appear in either token (like "?"):
TOKEN1=":start:"
TOKEN2="/end"
VALUE=$(echo "${INPUT}" | sed -e "s?.*$TOKEN1\(.*\)$TOKEN2.*?\1?")
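If the tokens might contain regex metacharacters or the chosen delimiter, one way to sidestep the problem is to skip sed and match the tokens literally. The following is only a sketch: between is a hypothetical helper name, and it assumes the start token precedes the end token in the input.

```shell
# between INPUT START END: print the text between the first START and the last END.
# Quoting "$2" and "$3" inside the expansion makes the shell treat them as literal text.
between() {
  _rest=${1#*"$2"}                 # drop everything up to and including START
  printf '%s\n' "${_rest%"$3"*}"   # drop END and everything after it
}

INPUT="arn:aws:dfasdfasdf/asdfa:start:CaptureThis/end"
TOKEN1=":start:"
TOKEN2="/end"
VALUE=$(between "$INPUT" "$TOKEN1" "$TOKEN2")
echo "$VALUE"
```

If a token is not found, the corresponding expansion leaves the string unchanged, so you may want to check for that case separately.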
Upvotes: 2