Reputation: 3638
I'm having a bit of an issue with grep that I can't seem to figure out. I'm trying to search for all instances of lowercase words enclosed in double quotes (C strings) in a set of source files. Using Bash and GNU grep:
grep -e '"[a-z]+"' *.cpp
gives me no matches, while
grep -e '"[a-z]*"' *.cpp
gives me matches like "Abc" which is not just lower case characters. What is the proper regular expression to match only "abc"?
Upvotes: 10
Views: 8238
Reputation: 503
If you don't want to mess about with locales, this worked for me:
grep -e '"[[:lower:]]\+"'
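A quick way to check the character-class approach is to feed grep a line with mixed-case strings. This is only a minimal sketch: the sample input and the variable name `lower` are made up for illustration, and `-o` (print only the matched part) is used just to make the result easy to read.

```shell
# Hypothetical sample line containing three quoted strings.
# [[:lower:]] is the POSIX class for lowercase letters, so no a-z range
# (and no collation-order surprise) is involved.
lower=$(printf '%s\n' '"abc" "Abc" "aBc"' | grep -oE '"[[:lower:]]+"')

# Only the all-lowercase string should be reported.
echo "$lower"
```

With `-E` the `+` needs no backslash; in the default basic syntax the same class works as `'"[[:lower:]]\+"'`.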
Upvotes: 0
Reputation: 36250
Escape the +:
grep -e '"[a-z]\+"' *.cpp
or use egrep:
egrep '"[a-z]+"' *.cpp
maybe you had -E in mind:
grep -E '"[a-z]+"' *.cpp
The lowercase -e is used, for example, to specify multiple search patterns.
The uppercase matches probably come from your locale, which you can override for a single command with:
LC_ALL=C egrep '"[a-z]+"' *.cpp
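To see the difference between the two syntaxes side by side, here is a small sketch. The sample text and the variable names (`sample`, `bre`, `ere`, `literal`) are assumptions for illustration only, and `LC_ALL=C` is set on each command so the `[a-z]` range behaves the same regardless of the surrounding locale.

```shell
# Hypothetical input: one all-lowercase string and one mixed-case string.
sample='x = "abc"; y = "Abc";'

# Basic syntax (grep's default): \+ means "one or more" in GNU grep.
bre=$(printf '%s\n' "$sample" | LC_ALL=C grep -o '"[a-z]\+"')

# Extended syntax (-E): a bare + is already the quantifier.
ere=$(printf '%s\n' "$sample" | LC_ALL=C grep -oE '"[a-z]+"')

# In basic syntax an unescaped + is a literal plus sign, which is why
# the original command found nothing in ordinary source code.
literal=$(printf '%s\n' '"a+"' | LC_ALL=C grep -o '"[a-z]+"')

echo "$bre"      # the lowercase string only
echo "$ere"      # same match as the basic-syntax form
echo "$literal"  # matches the text "a+" including the plus
```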
Upvotes: 1
Reputation: 137987
You're forgetting to escape the metacharacter. In grep's default basic syntax, + is a literal character unless you write it as \+:
grep -e '"[a-z]\+"'
For the second part, it matches mixed-case strings because of your locale, as shown here:
$ echo '"Abc"' | grep -e '"[a-z]\+"'
"Abc"
$ export LC_ALL=C
$ echo '"Abc"' | grep -e '"[a-z]\+"'
$
To get the "ascii-like" behavior, you need to set your locale to "C", as specified in the grep man page:
Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C.
Upvotes: 9
Reputation: 127538
You probably need to escape the +:
grep -e '"[a-z]\+"' *.cpp
Upvotes: 0