michael_stackof

Reputation: 223

Escape character confusing in sed command

A common usage of sed for escaping a dot is: sed 's/\\\. / \\\\\\\\./g'
I am really confused about why there are so many backslashes.

Upvotes: 1

Views: 2936

Answers (3)

OshoParth

Reputation: 1552

The simplest way to decode the escape sequences is to divide them into pairs (one backslash plus the character that follows it) and then evaluate each pair individually. Note that the pair \\ denotes a single backslash in the output.
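
As a rough sketch of the pairing idea, sed itself can be used to mark the pairs in the two halves of the command from the question. The variable names are only illustrative, and the angle brackets are added by printf to make the leading and trailing spaces visible.

```shell
# Wrap every backslash-plus-next-character pair in [ ] so the pairs stand out.
# The regex \\. reads as: one literal backslash (\\) followed by any character (.).
pattern_pairs=$(printf '%s\n' '\\\. ' | sed 's/\\./[&]/g')
replacement_pairs=$(printf '%s\n' ' \\\\\\\\.' | sed 's/\\./[&]/g')

printf '<%s>\n' "$pattern_pairs"      # <[\\][\.] >
printf '<%s>\n' "$replacement_pairs"  # < [\\][\\][\\][\\].>
```

Each bracketed pair can then be read on its own: [\\] is one backslash and [\.] is one literal dot.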

Some common sed commands are:

a \ - Append
i \ - Insert
s/..../..../ - Substitute


Sed Pattern Flags
/g - Global
/I - Ignore Case
/p - Print
/w filename - Write Filename

For further reference, see: http://www.grymoire.com/unix/sed.html

Upvotes: 1

NeronLeVelu

Reputation: 10039

/\\\. /

searches for the literal text \. followed by a space.

By default, \ escapes the character that follows it, and a dot means "any character". So the literal text \. is written in regex form as an escaped backslash (\\) followed by an escaped dot (\.).

/ \\\\\\\\./

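replaces it with a space followed by \\\\. (four backslashes and a dot). The replacement contains 8 backslashes and 1 dot: each of the 4 output backslashes has to be escaped, for the same reason as in the search pattern earlier, but in the replacement a dot simply means a dot, so there is no need to escape it.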

Another way to write it is s/[\][.]/\\\\\\&/ or, for fun, s/\([\]\)[.]/\1\1\1&/ (neither of these touches the surrounding spaces).
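
A quick check of the two alternatives above, written as complete s commands. This is only a sketch: the sample input is made up, and it assumes a sed (such as GNU sed) in which a backslash inside a bracket expression is taken literally.

```shell
# Hypothetical sample line containing backslash, dot, space.
input='end\. next'

# [\] matches one backslash and [.] one literal dot;
# & in the replacement stands for the matched text (\.).
bracket=$(printf '%s\n' "$input" | sed 's/[\][.]/\\\\\\&/')

# \1 is the captured backslash, so \1\1\1& is three backslashes plus the match.
capture=$(printf '%s\n' "$input" | sed 's/\([\]\)[.]/\1\1\1&/')

printf '%s\n' "$bracket"   # end\\\\. next
printf '%s\n' "$capture"   # end\\\\. next
```

Both forms turn \. into four backslashes and a dot, but unlike the original command they leave the space where it was.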

Upvotes: 1

Jonathan Leffler

Reputation: 754440

There is no obvious reason why that pair of patterns is desirable, but what it does is look for a backslash, a dot and a space, and replaces that sequence by a space, four backslashes and a dot; all of which is done for each backslash, dot, space sequence in the original input.

The substitute command is:

s/\\\. / \\\\\\\\./

In the first part of the substitute command, the match part:

  • You have a pair of backslashes; these match a single backslash.
  • You have a backslash dot pair; these match a single dot (normally, a dot matches any character, so the backslash suspends the special metacharacter meaning for dot).
  • You have a space.

In the second part of the substitute command, the replacement part:

  • You have a space.
  • You have four pairs of backslashes, each of which places a single backslash in the replacement.
  • You have a dot.
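
The breakdown above can be checked on a small input. The sample lines and variable names below are made up for illustration; the sed command is the one from the question, including its g flag.

```shell
# A line that contains backslash, dot, space, and one that contains only dot, space.
with_backslash='end\. next'
without_backslash='end. next'

changed=$(printf '%s\n' "$with_backslash" | sed 's/\\\. / \\\\\\\\./g')
unchanged=$(printf '%s\n' "$without_backslash" | sed 's/\\\. / \\\\\\\\./g')

printf '%s\n' "$changed"     # end \\\\.next
printf '%s\n' "$unchanged"   # end. next
```

Only the backslash, dot, space sequence is rewritten; a plain dot followed by a space is left alone.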

But beyond testing your ability to write regular expressions in sed, there is no obvious reason why this is an appropriate substitution.

Note that because the sed script expression is inside single quotes, the shell is not doing any interpreting of the contents of the string. If it was enclosed in double quotes, then the shell would process the argument, and would remove five of the backslashes before sed got to see the expression, leading to a different interpretation of what goes on. This is an excellent reason for using single quotes around regular expressions whenever possible.
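
To see the effect of the quoting, the following sketch prints the script as sed would receive it under each kind of quote, and then runs both versions on a made-up input line (backslash, x, space) that the single-quoted script should not match. It assumes a POSIX shell.

```shell
# printf shows the argument after the shell has processed the quotes.
single=$(printf '%s\n' 's/\\\. / \\\\\\\\./')
double=$(printf '%s\n' "s/\\\. / \\\\\\\\./")
printf '%s\n' "$single"   # s/\\\. / \\\\\\\\./
printf '%s\n' "$double"   # s/\\. / \\\\./

# With double quotes sed sees \\. which means "backslash, then any character".
input='end\x next'
strict=$(printf '%s\n' "$input" | sed 's/\\\. / \\\\\\\\./')
loose=$(printf '%s\n' "$input" | sed "s/\\\. / \\\\\\\\./")
printf '%s\n' "$strict"   # end\x next
printf '%s\n' "$loose"    # end \\.next
```

The double-quoted version loses five backslashes on the way to sed, so it matches more than intended and inserts only two backslashes.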

Upvotes: 1
