Scott Wilton

Reputation: 253

Why does sed give an "Invalid content" error on Linux but not on Mac?

I have the following sed extended regular expression replacement inside a bash script:

sed -i.bak -E 's~^[[:blank:]]*\\iftoggle{[[:alnum:]_]+}{\\input{([[:alnum:]_\/]+)}}{}~\\input{\1}~' file.txt

which should replace strings like

\iftoggle{xx_yy}{\input{xx_yy/zz}}{}

with

\input{xx_yy/zz}

inside file.txt.

This works just fine locally, on OS X, but the script needs to be POSIX. Specifically, it fails on my remote Travis CI build (which uses Linux). While sed -E is not documented for GNU sed, it behaves just like sed -r and seems to work fine, allowing for a POSIX version of sed with extended regular expressions.

The error given is:

sed: -e expression #1, char 81: Invalid content of \{\}

I'm also not sure where the error starts counting characters from: the beginning of the line, or only the part enclosed in quotes (the expression)?

Upvotes: 5

Views: 3435

Answers (2)

SLePort

Reputation: 15461

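You don't need ERE here. With -E, GNU sed reads an unescaped { as the start of an interval expression (like {1,3}), which is where the "Invalid content of \{\}" error comes from. Using BRE: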

sed -i.bak 's~^[[:blank:]]*\\iftoggle{[[:alnum:]_][[:alnum:]_]*}{\\input{\([[:alnum:]_\/][[:alnum:]_\/]*\)}}{}~\\input{\1}~' file.txt

In BRE an unescaped { is a literal brace, so it doesn't need to be escaped here, but the capture group has to be written as \( ... \).

As + is not part of the BRE, you can replace [[:alnum:]_]+ with [[:alnum:]_][[:alnum:]_]* or with [[:alnum:]_]\{1,\}.
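To illustrate the \{1,\} spelling mentioned above, here is a minimal sketch that wraps the BRE command in a shell function and runs it on one sample line. The function name strip_iftoggle_bre is made up for this example, and the sample line is the one from the question.

```shell
# strip_iftoggle_bre is a hypothetical helper name, not part of sed.
# BRE only: { and } are literal, \{1,\} means "one or more",
# and \( \) is the capture group referenced as \1.
strip_iftoggle_bre() {
  sed 's~^[[:blank:]]*\\iftoggle{[[:alnum:]_]\{1,\}}{\\input{\([[:alnum:]_/]\{1,\}\)}}{}~\\input{\1}~' "$@"
}

# Feed one sample line through the filter on stdin.
printf '%s\n' '\iftoggle{xx_yy}{\input{xx_yy/zz}}{}' | strip_iftoggle_bre
```

This should print \input{xx_yy/zz}. Since ~ is the delimiter, the / inside the bracket expression doesn't need a backslash.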

And as a side note, \+ can be used with GNU sed in BRE but keep in mind that it's not portable, it's a GNU extension.
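If you would rather keep -E, the original pattern should also work once every literal brace in the pattern is escaped, because \{ and \} are literal in ERE. A sketch under that assumption (strip_iftoggle_ere is a hypothetical name, and -E must be accepted by the sed in use, as it is by GNU and BSD sed):

```shell
# strip_iftoggle_ere is a hypothetical helper name, not part of sed.
# ERE: + and ( ) are operators, so only the literal braces need a backslash.
strip_iftoggle_ere() {
  sed -E 's~^[[:blank:]]*\\iftoggle\{[[:alnum:]_]+\}\{\\input\{([[:alnum:]_/]+)\}\}\{\}~\\input{\1}~' "$@"
}

# Same sample line as in the question.
printf '%s\n' '\iftoggle{xx_yy}{\input{xx_yy/zz}}{}' | strip_iftoggle_ere
```

The braces in the replacement part are plain text, so they stay unescaped.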

Upvotes: 6

Inian

Reputation: 85550

This does not directly answer the question with sed, but provides a simpler alternative using Perl's command-line regex search and replace.

perl -p -e 's|\\iftoggle\{(\w+)\}\{\\input\{(\w+)/(\w+)\}\}\{\}|\\input{$2/$3}|g' file
\input{xx_yy/zz}

This uses | as the delimiter and \w+ to match runs of alphanumeric characters and underscores (the equivalent of [[:alnum:]_]+).

For in-place replacement, use the -i flag, similar to sed:

perl -p -i.bak -e 's|\\iftoggle\{(\w+)\}\{\\input\{(\w+)/(\w+)\}\}\{\}|\\input{$2/$3}|g' file

Regarding word characters (\w), the Perl documentation on character classes says:

Word characters

A \w matches a single alphanumeric character (an alphabetic character, or a decimal digit); or a connecting punctuation character, such as an underscore ("_"); or a "mark" character (like some sort of accent) that attaches to one of those. It does not match a whole word. To match a whole word, use \w+ . This isn't the same thing as matching an English word, but in the ASCII range it is the same as a string of Perl-identifier characters.

For an input with multiple folders inside \input, e.g.

cat file
\iftoggle{xx_yy}{\input{xx_yy/zz_yy_zz_kk/dude_hjgk}}{}

perl -p -e 's|\\iftoggle\{(\w+)\}\{\\input\{(\w+)/(\w+)/(\w+)\}\}\{\}|\\input{$2/$3/$4}|g' file
\input{xx_yy/zz_yy_zz_kk/dude_hjgk}

Just add as many capturing groups as you need.

Upvotes: 1
