Reputation: 337
I'm trying to use egrep with a regex pattern to match whitespace.
I've used regular expressions with Perl and C# before, and they both support the pattern \s to search for whitespace. egrep (or at least the version I'm using) does not seem to support this pattern.
In a few articles online I've come across a shorthand [[:space:]], but this does not seem to work. Any help is appreciated.
Using: SunOS 5.10
Upvotes: 22
Views: 63370
Reputation: 1260
If you are using bash, the syntax for putting a tab in a string is
$'foo\tbar'
I was recently working with sed to do some fixups on a tab-delimited file. Part of the command was:
sed -E -e $'s/\t--QUOTE--/\t"/g'
That argument is parsed by bash, and sed sees a regex with literal tabs.
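A minimal sketch of the same idea, assuming bash (the `$'...'` form is not understood by every shell). The sample input is made up for illustration, and `-E` is omitted because this substitution does not need it:

```shell
#!/usr/bin/env bash
# Made-up sample input: two fields separated by a real tab.
input=$'name\t--QUOTE--value'

# bash expands \t inside $'...' before sed runs, so sed receives literal tabs.
result=$(printf '%s\n' "$input" | sed -e $'s/\t--QUOTE--/\t"/g')

printf '%s\n' "$result"
```

The tab in front of the marker is kept and `--QUOTE--` is replaced by a double quote.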
Upvotes: 3
Reputation: 11220
$ cat > file
this line has whitespace
thislinedoesnthave
$ egrep [[:space:]] file
this line has whitespace
Works under debian.
For Solaris, isn't there something like Gentoo's "eselect", or an alternatives mechanism, to set your default egrep version?
Have you tried grep -E? If the egrep on your path is not the right one, maybe grep is.
Upvotes: -3
Reputation: 21525
I see the same issue on SunOS 5.10. /usr/bin/egrep does not support extended regular expressions.
Try using /usr/xpg4/bin/egrep:
$ echo 'this line has whitespace
thislinedoesnthave' | /usr/xpg4/bin/egrep '[[:space:]]'
this line has whitespace
Another option might be to just use perl:
$ echo 'this line has whitespace
thislinedoesnthave' | perl -ne 'chomp;print "$_\n" if /[[:space:]]/'
this line has whitespace
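If a script has to run on more than one system, one possible approach is to pick the XPG4 binary only where it exists. This is a sketch, not a tested Solaris recipe: the `EGREP` variable name is my own, and the fallback assumes the default `grep` accepts `-E` and POSIX character classes.

```shell
#!/bin/sh
# Prefer the XPG4 egrep where it exists (as on SunOS 5.10); otherwise
# assume the default grep understands -E and [[:space:]].
if [ -x /usr/xpg4/bin/egrep ]; then
    EGREP=/usr/xpg4/bin/egrep
else
    EGREP='grep -E'
fi

# $EGREP is left unquoted on purpose so that 'grep -E' splits into two words.
matches=$(printf 'this line has whitespace\nthislinedoesnthave\n' | $EGREP '[[:space:]]')
printf '%s\n' "$matches"
```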
Upvotes: 25
Reputation: 882466
If you're using 'degraded' versions of grep (I quote the term because most UNIXes I work on still use the original REs, not those fancy ones with "\s" or "[[:space:]]" :-), you can just revert to the lowest form of RE.
For example, if :space: is defined as spaces and tabs, just use:
egrep '[ ^I]' file
That ^I is an actual tab character, not the two characters ^ and I.
This is assuming :space: is defined as tabs and spaces; otherwise adjust the choices within the [] characters.
The advantage of using degraded REs is that they should work on all platforms (at least for ASCII; Unicode or non-English languages may have different rules but I rarely find a need).
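Typing a literal tab inside a script can be awkward, so here is a sketch of one way to build the same bracket expression with `printf`. The variable names and sample input are made up, and plain `grep` is used because a bracket expression like this does not need extended syntax:

```shell
#!/bin/sh
# Capture a literal tab; command substitution only strips trailing newlines.
tab=$(printf '\t')

# Bracket expression containing one space and one tab, nothing fancier.
pattern="[ ${tab}]"

# Count the lines that contain a space or a tab.
count=$(printf 'has space\nhas\ttab\nnowhitespace\n' | grep -c "$pattern")
printf '%s\n' "$count"
```

Of the three sample lines, the first two contain a space or a tab, so they are the ones counted.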
Upvotes: 15
Reputation: 11247
Maybe you should protect the pattern with quotes (in bash, or the equivalent in whatever shell you are using).
[ and ] may have special meaning for the shell.
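A small sketch of what can go wrong, assuming a shell whose globbing understands character classes (bash does). The scratch directory and the oddly named file are hypothetical and exist only to trigger the expansion:

```shell
#!/bin/sh
# Scratch directory and file name are hypothetical, for demonstration only.
demo=$(mktemp -d)
cd "$demo" || exit 1
: > ' '    # create a file whose name is a single space

# Unquoted: the bracket expression is a valid glob, so it expands to the file name.
unquoted=$(printf '%s' [[:space:]])
# Quoted: the pattern is passed through untouched.
quoted=$(printf '%s' '[[:space:]]')

printf 'unquoted=<%s> quoted=<%s>\n' "$unquoted" "$quoted"

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

With no matching file name in the directory, most shells pass the unquoted pattern through unchanged, which is why the problem can stay hidden for a long time.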
Upvotes: 0