Reputation: 2304
I have a file containing these lines:
SOME COMMAND 34 XXXXX ;
; a comment which may contain a :
sometext001 : X00 : 1 ;
: X01 : 1 ;
: X11 : 1 ;
I want to retrieve sometext001
with grep/egrep.
Using the regex ^\s*[^:\s;]+\s*:
(in words: starting at the beginning of the line, optional whitespace, followed by at least one character that is not whitespace, a colon or a semicolon, followed by optional whitespace, followed by a colon)
I'm able to match the text (including the following :) using an online regex tester http://regexr.com?35eam if I enable multiline support.
I was under the impression that grep/egrep work line by line anyway, so why does the regex not work when used with egrep on a file containing this example?
Is there another way to achieve the desired result with egrep or, if that's not possible, with another one-liner callable from a shell script?
Update: although the proposed change of the regex to ^[[:space:]]*[^[:space:];]+[[:space:]]*:
matches the lines specified, it still matches twice in that line, once for sometext001 :
and once for X00 :
as evident when using the -o option to egrep.
How can I solve this?
Update: The test file contained exactly the text given above. The command line was egrep -o '^([[:space:]]*[^:[:space:];]+[[:space:]]*:)' test.txt
(also tried without the () pair). Output is
sometext001 :
X00 :
Upvotes: 1
Views: 514
Reputation: 195169
With GNU grep:
grep -Po '^\s*\K[^\s:;]*(?= :)'
With your example:
kent$ echo "SOME COMMAND 34 XXXXX ;
; a comment which may contain a :
sometext001 : X00 : 1 ;
: X00 : 1 ;
: X11 : 1 ;"|grep -Po '^\s*\K[^\s:;]*(?= :)'
sometext001
Upvotes: 0
Reputation: 785376
It is better to use -P
(the Perl-compatible regex switch) with the regex that you already have:
grep -P '^\s*[^:\s;]+\s*:'
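Not every grep provides -P (it is a GNU extension), and without -o the command above prints the whole matching line rather than just the label. As a rough alternative, the same rule can be sketched in awk by splitting on the colon and checking the first field. The names input, labels and label are illustrative only, and whitespace is assumed to mean spaces and tabs:

```shell
# Sample text from the question, held in a shell variable.
input='SOME COMMAND 34 XXXXX ;
; a comment which may contain a :
sometext001 : X00 : 1 ;
: X01 : 1 ;
: X11 : 1 ;'

# Split each line on ":" and inspect only the text before the first colon.
labels=$(printf '%s\n' "$input" | awk -F: '
  NF > 1 {                                   # the line contains at least one colon
    label = $1
    gsub(/^[ \t]+|[ \t]+$/, "", label)       # trim surrounding blanks
    # keep the field only if it is a single non-empty token without ";"
    if (label != "" && label !~ /[ \t;]/) print label
  }')

printf '%s\n' "$labels"
```

Because each line is examined once and only the first field is printed, the second field (X00) is never reported.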
Upvotes: 1
Reputation: 336308
egrep
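uses POSIX EREs by default, and those don't define \s
and other Perl-style shorthands; inside a bracket expression such as [^:\s;] the backslash and the s are taken as literal characters, so the class also excludes the letter s. Try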
^[[:space:]]*[^:[:space:];]+[[:space:]]*:
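To print only the label with this POSIX pattern, one option is to use sed instead of egrep -o. The sketch below wraps the label in a capture group and replaces the whole line with it, so each line yields at most one result. The variable names input and result are illustrative, and the -E option is assumed to be available (GNU and BSD sed both accept it):

```shell
# Sample text from the question, held in a shell variable.
input='SOME COMMAND 34 XXXXX ;
; a comment which may contain a :
sometext001 : X00 : 1 ;
: X01 : 1 ;
: X11 : 1 ;'

# -n suppresses automatic printing; the trailing p prints a line only when
# the substitution succeeded. The capture group holds the label, and the
# final .* consumes the rest of the line so only the label is left.
result=$(printf '%s\n' "$input" |
  sed -n -E 's/^[[:space:]]*([^:[:space:];]+)[[:space:]]*:.*/\1/p')

printf '%s\n' "$result"
```

The substitution is anchored with ^ and applied once per line, so the later X00 : part of the line cannot produce a second result.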
Upvotes: 2