Reputation: 355
Please try
egrep "^[a-z]{3}$" /usr/share/dict/words
egrep "^[[:lower:]]{3}$" /usr/share/dict/words
The first one returns both uppercase and lowercase words. The second one returns lowercase words only.
Upvotes: 1
Views: 260
Reputation: 162801
Are you sure? On my system (OS X Snow Leopard), both commands return exactly the same results: 3-letter lowercase words only.
$ egrep "^[a-z]{3}$" /usr/share/dict/words | wc -l
1134
$ egrep "^[[:lower:]]{3}$" /usr/share/dict/words | wc -l
1134
$ egrep "^[[:lower:]]{3}$" /usr/share/dict/words | md5
0a66d5e78cfbe6f9f66d2d90b1053972
$ egrep "^[a-z]{3}$" /usr/share/dict/words | md5
0a66d5e78cfbe6f9f66d2d90b1053972
What system are you using? Perhaps try man egrep and look for a case-sensitivity option. The egrep that ships with OS X offers only the opposite: -i, --ignore-case (ignore case distinctions).
I've also verified this on a CentOS Linux box:
$ egrep "^[a-z]{3}$" /usr/share/dict/words | wc -l
2044
$ egrep "^[[:lower:]]{3}$" /usr/share/dict/words | wc -l
2044
$ egrep "^[a-z]{3}$" /usr/share/dict/words | md5sum
480fb21554f9f731adddb0d648157926 -
$ egrep "^[[:lower:]]{3}$" /usr/share/dict/words | md5sum
480fb21554f9f731adddb0d648157926 -
It appears from your comments that you may be passing the -i or --ignore-case option to egrep. Turn that off to get only the lowercase results.
Upvotes: 1
Reputation: 881643
It has to do with your locale setting. If you set LC_ALL to C, it should work as expected.
From the egrep manpage under Ubuntu 11.04:
Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set.
For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C.
You can try the commands from the following transcript to confirm this:
pax$ egrep "^[a-z]{3}$" /usr/share/dict/words | head -5l
AOL
Abe
Ada
Ala
Ali
pax$ LC_ALL=C egrep "^[a-z]{3}$" /usr/share/dict/words | head -5l
ace
act
add
ado
ads
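If you would rather not depend on the system dictionary, the same behaviour can be checked on a tiny word list. This is only a sketch: the sample words and the temporary file name are made up for illustration, and grep -E is used as the equivalent of egrep. It forces the C locale for the range expression and compares the result against the [[:lower:]] class, which should not depend on collation order for plain ASCII input.

```shell
#!/bin/sh
# Hypothetical sample word list (a stand-in for /usr/share/dict/words).
words="${TMPDIR:-/tmp}/sample_words.$$"
printf '%s\n' AOL Abe ace act Ada add > "$words"

# Range expression with the C locale forced for this one command:
# [a-z] then covers only the ASCII lowercase letters.
range_c=$(LC_ALL=C grep -E '^[a-z]{3}$' "$words")

# POSIX character class, run under whatever locale is currently active:
# for ASCII input it matches lowercase letters regardless of collation.
class_any=$(grep -E '^[[:lower:]]{3}$' "$words")

printf 'range, C locale:\n%s\n' "$range_c"
printf 'class, current locale:\n%s\n' "$class_any"

rm -f "$words"
```

Running the first grep without LC_ALL=C as well lets you see whether your own locale makes [a-z] pick up uppercase words; that result varies by system, so it is left out of the sketch.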
Upvotes: 4