Tirafesi

Reputation: 1469

How to use capture groups with sed?

I'm trying to replace some text in a file using sed, but I'm having trouble.

Running

sed -ir 's/(\$hello = )true/\1false/' /path/to/my/file.txt

gives the error

sed: -e expression #1, char 27: invalid reference \1 on 's' command's RHS

I want to replace $hello = true with $hello = false, so in order to avoid typing $hello = twice I wanted to use capture groups - which isn't working.

What am I doing wrong?

Upvotes: 1

Views: 803

Answers (2)

luciole75w

Reputation: 1117

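You don't have to escape parentheses in extended regex mode, which I assume was the intent of the r in -ir. However, if you want both options -i and -r, you have to keep them apart (-i -r) or write -ri instead of -ir, because sed interprets whatever follows -i as an optional backup suffix. With -ir, sed edits in place using the backup suffix r and never enables extended regex, so ( and ) are treated as literal characters and \1 refers to a group that does not exist.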

From the sed manual:

Because -i takes an optional argument, it should not be followed by other short options:

sed -Ei '...' FILE

Same as -E -i with no backup suffix - FILE will be edited in-place without creating a backup.

sed -iE '...' FILE

This is equivalent to --in-place=E, creating FILEE as backup of FILE
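As a minimal sketch of the difference, assuming GNU sed: the snippet below first shows that -ir leaves a backup file with the suffix r, then runs the substitution from the question with -i and -E passed separately. The temporary directory and file.txt are hypothetical stand-ins for /path/to/my/file.txt.

```shell
# Assumes GNU sed. file.txt is a hypothetical stand-in for the real file.
dir=$(mktemp -d)
file="$dir/file.txt"
printf '%s\n' '$hello = true' > "$file"

# -ir is parsed as -i with backup suffix "r": basic regex stays in effect
# and a backup named file.txtr is written next to the edited file.
sed -ir 's/true/TRUE/' "$file"
backup_made=no
[ -f "${file}r" ] && backup_made=yes

# Reset the file, then pass the options separately so that extended
# regex is really enabled and ( ) act as a capture group.
printf '%s\n' '$hello = true' > "$file"
sed -i -E 's/(\$hello = )true/\1false/' "$file"
result=$(cat "$file")

echo "backup created by -ir: $backup_made"
echo "result: $result"
rm -rf "$dir"
```

Writing -ri or -Ei works as well, since the option that takes the optional suffix then comes last.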

Upvotes: 2

Olaf Dietsche

Reputation: 74018

In basic regex mode (the default, without -r or -E), you must escape the parentheses with backslashes, \(...\), for them to be used as grouping.
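Applied to the substitution from the question, this could look like the sketch below. It reads a sample line from standard input rather than editing a file in place, so nothing on disk is modified; the sample line is an assumption based on the question.

```shell
# Basic regex (no -E/-r): grouping needs \( and \), and \1 refers to that group.
result=$(printf '%s\n' '$hello = true' | sed 's/\(\$hello = \)true/\1false/')
echo "$result"
```

To edit the file itself, the same expression can be used with -i alone, for example sed -i 's/\(\$hello = \)true/\1false/' /path/to/my/file.txt.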

THE SED FAQ has an example in section "3.1.2. Escape characters on the right side of "s///"":

3.1.2. Escape characters on the right side of "s///"

The right-hand side (the replacement part) in "s/find/replace/" is almost always a string literal, with no interpolation of these metacharacters:

  .   ^   $   [   ]   {   }   (   )  ?   +   *   |

Three things are interpolated: ampersand (&), backreferences, and options for special seds. An ampersand on the RHS is replaced by the entire expression matched on the LHS. There is never any reason to use grouping like this:

  s/\(some-complex-regex\)/one two \1 three/
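A small illustration of the ampersand behavior described in this excerpt, using a hypothetical input string: wrapping the whole pattern in a group and referring to it as \1 gives the same result as using & for the entire match.

```shell
# Group around the entire pattern, referenced as \1 (basic regex syntax).
with_group=$(printf '%s\n' 'abc' | sed 's/\(b\)/[\1]/')

# Same substitution using & for the whole match, with no group needed.
with_amp=$(printf '%s\n' 'abc' | sed 's/b/[&]/')

echo "$with_group"
echo "$with_amp"
```

A group is still needed when only part of the match should be reused, as in the question, where "true" is matched but must not be copied into the replacement.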

And later in section "F. GNU sed v2.05 and higher versions":

F. GNU sed v2.05 and higher versions

...

Undocumented -r switch:

Beginning with version 3.02, GNU sed has an undocumented -r switch (undocumented till version 4.0), activating Extended Regular Expressions in the following manner:

 ?      -  0 or 1 occurrence of previous character
 +      -  1 or more occurrences of previous character
 |      -  matches the string on either side, e.g., foo|bar
 (...)  -  enable grouping without backslash
 {...}  -  enable interval expression without backslash

When the -r switch (mnemonic: "regular expression") is used, prefix these symbols with a backslash to disable the special meaning.


For documentation of the regular expression syntax used in (GNU) sed, see "Overview of basic regular expression syntax":

5.3 Overview of basic regular expression syntax

...

\(regexp\)

Groups the inner regexp as a whole, this is used to:

  • Apply postfix operators, like \(abcd\)*: this will search for zero or more whole sequences of ‘abcd’, while abcd* would search for ‘abc’ followed by zero or more occurrences of ‘d’. Note that support for \(abcd\)* is required by POSIX 1003.1-2001, but many non-GNU implementations do not support it and hence it is not universally portable.
  • Use back references (see below).
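The difference between a grouped and an ungrouped postfix operator can be sketched as follows, assuming GNU sed (as the excerpt notes, other implementations may not support a star after a group in basic regex). The input string is hypothetical, and & is used only to mark what was matched.

```shell
# The star applies to the whole group: both "abcd" sequences are matched.
grouped=$(printf '%s\n' 'abcdabcd' | sed 's/^\(abcd\)*/<&>/')

# The star applies only to "d": the match is "abc" plus the following d's.
ungrouped=$(printf '%s\n' 'abcdabcd' | sed 's/^abcd*/<&>/')

echo "$grouped"
echo "$ungrouped"
```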

Upvotes: 2
