Reputation: 4922
guys:
It is hard for me to judge when to escape special characters in the shell, and which characters should be escaped. For example:
sed '/[0-9]\{3\}/d' filename.txt
In the command above, why do we have to escape { while leaving [ unchanged? I think they are both special characters.
Can you help me with this?
/br
ruan
Upvotes: 1
Views: 570
Reputation: 10039
It mainly depends on the sed version (POSIX-compliant or extended behavior), and then you need to adapt to the shell because, as you note, some modifications happen before sed receives its script. The best examples are the choice of single or double quotes at the shell level, and \( versus ( at the sed level.
So, as an exercise: write the sed substitution that replaces \{ with &/$IFS (the literal text, not the value of IFS), with double quotes around the sed script, in Bash/ksh and with POSIX or GNU sed.
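One possible solution to this exercise is sketched below. It is only an illustration: the sample input a\{b and the variable name result are made up for the demonstration, and it assumes the default (basic) regex mode of sed.

```shell
# Script that sed itself must receive:  s/\\{/\&\/$IFS/
#   \\{   matches a literal backslash followed by a literal {
#   \&    is a literal & (a bare & would mean "the matched text")
#   \/    is a literal / (a bare / would end the replacement)
# Inside double quotes the shell turns \\ into \ and \$ into $,
# so every backslash meant for sed is doubled and the $ is escaped.
result=$(printf '%s\n' 'a\{b' | sed "s/\\\\{/\\&\\/\$IFS/")
printf '%s\n' "$result"
```

With single quotes around the script, none of the shell-level doubling would be needed; only the sed-level escapes would remain.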
Upvotes: 0
Reputation: 531718
The general answer is that you need to escape characters that have special meaning when you want to treat them as literal characters, not for their special meaning. The rules for what characters have special meaning vary from program to program.
Your specific question involves characters that have special meaning to sed; the single quotes prevent any enclosed characters from being interpreted by bash.
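To see what the shell actually hands over to sed, it can help to print the script string by itself. The snippet below is a minimal sketch; the variable names single and double are only illustrative.

```shell
# Single quotes: the shell passes every character through untouched,
# so sed would receive the backslashes exactly as typed.
single='/[0-9]\{3\}/d'

# Double quotes: the shell still processes \\ (and \$, \", \`),
# so each \\ collapses to one backslash before sed sees the script.
double="/[0-9]\\{3\\}/d"

printf '%s\n' "$single"
printf '%s\n' "$double"
```

Both lines print the same text, which shows that the two spellings give sed an identical script.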
In this case, the backslashes before { and } determine whether sed reads the braces as special or as literal characters. First, consider this command:
sed '/[0-9]{3}/d' filename.txt
If you are using a version of sed that treats both [ and { specially (that is, one using extended regular expressions, as with sed -E), this command says to delete any line that contains three digits in a row. The [0-9] is not a literal 5-character string; it's a regular expression that matches any single digit. The {3} isn't a literal 3-character string; it's a modifier that matches exactly 3 of the preceding regular expression. Lines like the following will be matched:
593
3296
but not
34a7
because there aren't 3 digits in a row.
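The three sample lines can be checked directly. This sketch assumes a sed that accepts -E for extended regular expressions (GNU sed and BSD/macOS sed both do); the variable name kept is illustrative.

```shell
# Feed the three sample lines to sed in extended-regex mode.
# Lines containing three digits in a row are deleted.
kept=$(printf '%s\n' 593 3296 34a7 | sed -E '/[0-9]{3}/d')
printf '%s\n' "$kept"
```

Only 34a7 survives, since it is the one line without three consecutive digits.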
Now, consider your command:
sed '/[0-9]\{3\}/d' filename.txt
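The [0-9] is still a regular expression that matches a single digit. But now you have escaped the braces. With a sed that treats a bare { specially (extended regular expressions, as with sed -E), the escaped braces are no longer a modifier for the preceding regular expression; sed treats them as the literal characters {, 3, and }. Be aware that sed's default basic regular expressions reverse the roles: there, \{3\} is the repetition count and a bare {3} is literal text. Under the extended syntax, the command matches lines like the following: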
0{3}
1{3}
5{3}
but not lines like
346
because there are no braces.
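The literal-brace reading can be tried out as follows. This is a sketch that assumes sed -E (extended regular expressions); without -E, sed's default basic syntax would read \{3\} as a repetition count instead. The variable name literal_kept is illustrative.

```shell
# In extended mode, \{ and \} stand for literal brace characters,
# so only lines containing a digit followed by the text {3} are deleted.
literal_kept=$(printf '%s\n' '0{3}' '1{3}' '5{3}' 346 | sed -E '/[0-9]\{3\}/d')
printf '%s\n' "$literal_kept"
```

The line 346 is the only one left, because it contains no braces.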
Upvotes: 1
Reputation: 785481
This difference in behavior comes from sed alone.
In its default mode sed supports only basic regular expressions, so { is matched literally unless it is escaped, as you noticed:
sed '/[0-9]\{3\}/d'
In extended regex mode, neither [ nor { needs escaping:
sed -r '/[0-9]{3}/d'
Or, on OS X:
sed -E '/[0-9]{3}/d'
[ and ] delimit a character class in both basic and extended regex modes (even the shell's glob patterns support it).
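The two modes can be compared side by side on the same input. The following is a rough sketch, assuming a sed that accepts -E (substitute -r on older GNU versions); the sample lines and variable names are made up for the demonstration.

```shell
input=$(printf '%s\n' 593 34a7 '0{3}')

# Default (basic) mode: \{3\} is the repetition count.
bre=$(printf '%s\n' "$input" | sed '/[0-9]\{3\}/d')

# Extended mode: a bare {3} is the repetition count.
ere=$(printf '%s\n' "$input" | sed -E '/[0-9]{3}/d')

# Default mode with bare braces: {3} is plain text here,
# so only the line containing "0{3}" is deleted.
bre_bare=$(printf '%s\n' "$input" | sed '/[0-9]{3}/d')

# The shell's own pattern matching understands [0-9] as well.
case 7 in
  [0-9]) glob_match=yes ;;
  *)     glob_match=no ;;
esac

printf 'basic, escaped:\n%s\n' "$bre"
printf 'extended, bare:\n%s\n' "$ere"
printf 'basic, bare:\n%s\n' "$bre_bare"
printf 'glob [0-9] matched 7: %s\n' "$glob_match"
```

The first two commands delete the same line (593), while the third deletes only the line with literal braces.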
Upvotes: 1
Reputation: 11649
I think your question pertains to special characters in regular expressions. Check this out:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03
Upvotes: 0