Lampros

Reputation: 392

Extracting numbers with sed

Input

...
...
PANO_20190917_185957.jpg
BURST20201216180114905_COVER.jpg
BURST20201214164624071_COVER.jpg
IMG_20190317_112951.jpg
IMG_20190317_112939.jpg
IMG_20190317_112936.jpg
IMG_20190317_112947.jpg
IMG_20200326_013746.jpg
...
...

Sed

$ fd . ./ -t f | sed  -E 's/.*\([0-9]\{1,6\}\).*/\1/'  
sed: -e expression #1, char 26: invalid reference \1 on `s' command's RHS

Desired Output

...
...
201909
202012
202012
201903
201903
201903
201903
202003
...
...

Any other way to do this? I've been struggling all day trying to get this working...

Upvotes: 2

Views: 515

Answers (2)

Wiktor Stribiżew

Reputation: 626758

There are several problems with your command:

  1. -E makes sed parse the pattern as a POSIX ERE, but the pattern is written in POSIX BRE syntax: in ERE, \( and \) match literal parentheses, so no capture group is defined and \1 is an invalid reference,
  2. the leading .* is greedy and consumes as much text as it can, so [0-9]\{1,6\}, which is satisfied by a single digit, only ever captures the last digit on the line,
  3. you want exactly 6 digits in the output, but the quantifier allows 1 to 6.
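
As a rough illustration of the first two points, the sketch below runs the original pattern on one file name from the question, once with -E and once without. The variable names (line, greedy, ere_ok) are made up for this example.

```shell
# One file name copied from the question's input.
line='IMG_20190317_112951.jpg'

# With -E the escaped parentheses are literals, so \1 refers to a group
# that does not exist and sed is expected to reject the command.
if printf '%s\n' "$line" | sed -E 's/.*\([0-9]\{1,6\}\).*/\1/' >/dev/null 2>&1; then
  ere_ok=yes
else
  ere_ok=no
fi

# Without -E the command is valid BRE, but the greedy .* leaves only the
# last digit on the line for the capture group.
greedy=$(printf '%s\n' "$line" | sed 's/.*\([0-9]\{1,6\}\).*/\1/')

printf 'ERE accepted: %s\n' "$ere_ok"
printf 'BRE result:   %s\n' "$greedy"
```

Dropping -E therefore removes the error but still does not produce the desired six digits.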

You can use the following POSIX BRE solution:

sed -n 's/[^0-9]*\([0-9]\{6\}\).*/\1/p'

In this command:

  • -n suppresses the default line output,
  • [^0-9]*\([0-9]\{6\}\).* matches zero or more chars other than digits, then captures exactly 6 digits into Group 1, and then matches the rest of the string,
  • \1 replaces the whole match with the Group 1 value, and
  • p prints the result of the substitution.
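
A minimal sketch of the command in use, fed with a few file names from the question instead of the fd output. The extra notes.txt entry is a hypothetical name added here to show that lines without a six-digit run are dropped by the -n and p combination.

```shell
# File names copied from the question, plus a hypothetical notes.txt
# that contains no digits at all.
names='PANO_20190917_185957.jpg
BURST20201216180114905_COVER.jpg
notes.txt
IMG_20190317_112951.jpg
IMG_20200326_013746.jpg'

# -n silences the default output; the p flag prints a line only when the
# substitution actually happened, so notes.txt does not appear.
months=$(printf '%s\n' "$names" | sed -n 's/[^0-9]*\([0-9]\{6\}\).*/\1/p')

printf '%s\n' "$months"
```

In the original pipeline the same sed command would simply replace the one after fd . ./ -t f |.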

Upvotes: 0

anubhava

Reputation: 785098

You may use this sed:

sed -E 's/^[^0-9]*([0-9]{1,6}).*/\1/' file

201909
202012
202012
201903
201903
201903
201903
202003

RegEx Explained:

  • -E: Enable extended regex mode (ERE)
  • ^: Start
  • [^0-9]*: Match 0 or more non-digits
  • ([0-9]{1,6}): Match 1 to 6 digits in 1st capture group
  • .*: Match 0 or more of any characters
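
Because the pattern is anchored with ^[^0-9]*, the greedy digit group starts at the first digit and takes up to six of them. The sketch below shows two edge cases that follow from this; IMG_42.jpg and notes.txt are hypothetical names, not taken from the question.

```shell
# One real name from the question and two hypothetical edge cases.
names='IMG_20190317_112951.jpg
IMG_42.jpg
notes.txt'

# {1,6} accepts short digit runs, and without -n/p a line that does not
# match is printed unchanged.
result=$(printf '%s\n' "$names" | sed -E 's/^[^0-9]*([0-9]{1,6}).*/\1/')

printf '%s\n' "$result"
```

If such names can occur in the directory, tightening the quantifier to {6} and adding -n with the p flag would filter them out.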

Upvotes: 3
