sed regex cannot get first match

Question

I give up. Given the following line:

15 Sep 1605.00 (SPX1530U1605-E),0.25,0.0,0.05,0.10,0,87

I want to extract the number 1530 from this line. "SPX" can be any combination of capital letters [A-Z] of varying length (e.g. GOOG, FB). The number is always followed by a capital letter, such as the "U" in the example.

The command below gets the second number, 1605. I'm at a loss as to how to extract 1530.

echo "15 Sep 1605.00 (SPX1530U1605-E),0.0,0.0,266.10,284.60,0,0" | \
gsed -r 's/.*[A-Z]([0-9].*)[-][A-Z].*/\1/g'

It would be acceptable to perform the operation on just the string "SPXW1530I1605-E" rather than the entire line.

hek2mgl · Accepted Answer

Usually grep is the tool of choice when you only want to extract data. GNU grep supports Perl-compatible regular expressions when you pass the -P option:

grep -oP '\([A-Z]+\K[0-9]+' file

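We are searching for a literal ( followed by one or more capital (ASCII) letters. Then we use \K, which resets the start of the reported match, so everything matched up to that point is dropped from the output. (Nice, isn't it?) The digits that follow are the final match, and -o prints only that part.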
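For comparison, here is a minimal sketch that runs the grep command on the sample line from the question and, as an alternative not given in the answer itself, a sed version. The sed attempt in the question returns 1605 because the leading .* is greedy: it consumes as much as it can, so [A-Z] ends up matching the last capital letter that is followed by digits (the "U"). Anchoring on the literal ( avoids that. The variable names are only illustrative, and the sed pattern assumes the line contains exactly one opening parenthesis.

```shell
line='15 Sep 1605.00 (SPX1530U1605-E),0.25,0.0,0.05,0.10,0,87'

# grep: -o prints only the matched part, -P enables PCRE (GNU grep only).
from_grep=$(printf '%s\n' "$line" | grep -oP '\([A-Z]+\K[0-9]+')

# sed: -E enables extended regular expressions, where \( is a literal
# parenthesis. Anchoring on it stops the greedy .* from running past the
# first number; the capture group holds the digits after the ticker letters.
from_sed=$(printf '%s\n' "$line" | sed -E 's/.*\([A-Z]+([0-9]+)[A-Z].*/\1/')

echo "grep: $from_grep"
echo "sed:  $from_sed"
```

With the sample line, both commands should print 1530. On systems where sed is not GNU sed, -E is generally the portable spelling of GNU's -r.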
