Sachin

Reputation: 1289

Unable to match multiple digits in regex

I am simply trying to print the 5- or 6-digit number present in each line.

cat file.txt

Random_something xyz ...64763
Random2 Some String abc-778986
Something something 676347
Random string without numbers

cat file.txt | sed 's/^.*\([0-9]\{5,6\}\+\).*$/\1/'

Current Output

64763
78986
76347
Random string without numbers

Expected Output

64763
778986
676347

The regex doesn't work as intended with 6-digit numbers: it drops the first digit of each 6-digit number. It also prints the last line, which I don't need because it doesn't contain any 5- or 6-digit number.

Upvotes: 2

Views: 1329

Answers (2)

anubhava

Reputation: 784898

grep is a better fit for this, with the -o option that prints only the matched string:

grep -Eo '[0-9]{5,6}' file

64763
778986
676347

-E is for enabling extended regex mode.
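
One edge case worth checking, sketched here on a made-up input line (the `line` value is hypothetical, not from the question): without any boundary, the pattern also matches the first six digits of a longer run. Assuming your grep supports -w (GNU and BSD grep do), adding it restricts the output to numbers that stand alone as whole words:

```shell
# Hypothetical line with a 7-digit run and a 6-digit number.
line='id 1234567 code-778986'

# Plain -o also reports the first six digits of the 7-digit run.
plain=$(printf '%s\n' "$line" | grep -Eo '[0-9]{5,6}')

# -w accepts a match only when it is not adjacent to other word characters.
whole=$(printf '%s\n' "$line" | grep -Eow '[0-9]{5,6}')

printf 'plain:\n%s\n\nwhole:\n%s\n' "$plain" "$whole"
```

Whether -w is appropriate depends on your data: it would also reject a number glued to letters or an underscore, such as abc778986.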


If you really want sed, this should work:

sed -En 's/(^|.*[^0-9])([0-9]{5,6}).*/\2/p' file

64763
778986
676347

Details:

  • -n: Suppress normal output
  • (^|.*[^0-9]): Match either the start of the line or any text that ends with a non-digit, so the greedy .* cannot consume leading digits of the number
  • ([0-9]{5,6}): Match 5 or 6 digits in capture group #2
  • .*: Match the remaining text
  • \2: Replacement that puts back only the digits captured in group #2
  • /p: Print the line only if a substitution was made
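
To see why the original command loses a digit, here is a small sketch that runs both versions on the sample lines from the question. The temporary file and the variable names are only for illustration, and the original command is shown without its trailing \+, which is a GNU extension and does not change the result here:

```shell
# Recreate the sample input from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' \
  'Random_something xyz ...64763' \
  'Random2 Some String abc-778986' \
  'Something something 676347' \
  'Random string without numbers' > "$tmp"

# Greedy ^.* takes as much as it can, leaving only the 5-digit minimum
# for the capture group; lines without a match are printed unchanged.
greedy=$(sed 's/^.*\([0-9]\{5,6\}\).*$/\1/' "$tmp")

# Requiring line start or a non-digit before the group keeps .* away from
# the digits; -n with the p flag prints only lines that were substituted.
fixed=$(sed -En 's/(^|.*[^0-9])([0-9]{5,6}).*/\2/p' "$tmp")

printf 'greedy:\n%s\n\nfixed:\n%s\n' "$greedy" "$fixed"
rm -f "$tmp"
```

The greedy run reproduces the question's current output (64763, 78986, 76347 and the unchanged last line), while the fixed run prints 64763, 778986 and 676347.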

Upvotes: 5

RavinderSingh13

Reputation: 133428

With awk, you could try the following. It uses awk's match function with a regex that matches 5 to 6 digits in each line; if a match is found, it prints the matched part.

awk 'match($0,/[0-9]{5,6}/){print substr($0,RSTART,RLENGTH)}' Input_file
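
Some awk implementations (older mawk releases, for example) may not support the {5,6} interval syntax. As a fallback sketch under that assumption, the same match can be written with five mandatory digits and an optional sixth. The temporary file and the `awk_out` variable are only for illustration:

```shell
# Recreate the sample input from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' \
  'Random_something xyz ...64763' \
  'Random2 Some String abc-778986' \
  'Something something 676347' \
  'Random string without numbers' > "$tmp"

# Five digits plus an optional sixth, written without interval syntax.
# match() sets RSTART and RLENGTH, which substr() uses to extract the number.
awk_out=$(awk 'match($0, /[0-9][0-9][0-9][0-9][0-9][0-9]?/) {
  print substr($0, RSTART, RLENGTH)
}' "$tmp")

printf '%s\n' "$awk_out"
rm -f "$tmp"
```

Lines without a 5- or 6-digit number produce no output, because the action only runs when match() returns a non-zero position.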

Upvotes: 3
