Reputation: 855
I have an XML file in which I am finding and replacing emails and usernames.
It all works, but to avoid some duplicate user emails etc., I want to skip XML elements of specific types.
I can do this if I want to skip ONE specific type, i.e.
/ApplicationUser/!s/"user.name"/"[email protected]"/g
But not if I try to skip multiple types in the one sed command:
/(OtherElement|ApplicationUser)/!s/"user.name"/"[email protected]"/g
OR
/\(OtherElement\|ApplicationUser\)/!s/"user.name"/"[email protected]"/g
OR
/\(OtherElement|ApplicationUser\)/!s/"user.name"/"[email protected]"/g
I am loading the commands from a file, if that is relevant. I assume it has something to do with the pattern at the start trying to match one of several words, but I'm not sure.
Upvotes: 0
Views: 99
Reputation: 6937
So, the regular expression syntax depends on the version of sed you're using.
First off, according to the POSIX specification, basic regular expressions (BRE) do not support alternation. However, tools do not necessarily follow the specification and, in particular, different versions of sed have different behavior.
The examples below are all processing this file:
$ cat sed-re-test.txt
OtherElement "user.name"
OnlyReplaceMe "user.name"
ApplicationUser "user.name"
The GNU sed BRE variant supports alternation, but the | metacharacter (along with ( and )) must be escaped with a \. If you use the -E flag to enable Extended Regular Expressions (ERE), then the metacharacters must not be escaped.
$ sed --version
sed (GNU sed) 4.4
<...SNIP...>
GNU sed BRE variant (with escaped metacharacters): WORKS
$ cat sed-re-test.txt | sed '/\(OtherElement\|ApplicationUser\)/!s/"user.name"/"[email protected]"/g'
OtherElement "user.name"
OnlyReplaceMe "[email protected]"
ApplicationUser "user.name"
GNU sed ERE (with unescaped metacharacters): WORKS
$ cat sed-re-test.txt | sed -E '/(OtherElement|ApplicationUser)/!s/"user.name"/"[email protected]"/g'
OtherElement "user.name"
OnlyReplaceMe "[email protected]"
ApplicationUser "user.name"
BSD sed does not support alternation in BRE mode. You must use -E to enable alternation support.
There is no --version flag, so identifying the OS will have to do:
$ uname -s
OpenBSD
BSD sed BRE (with escaped and unescaped metacharacters): DOES NOT WORK
$ cat sed-re-test.txt | sed '/\(OtherElement\|ApplicationUser\)/! s/"user.name"/"[email protected]"/'
OtherElement "[email protected]"
OnlyReplaceMe "[email protected]"
ApplicationUser "[email protected]"
$ cat sed-re-test.txt | sed '/(OtherElement|ApplicationUser)/! s/"user.name"/"[email protected]"/'
OtherElement "[email protected]"
OnlyReplaceMe "[email protected]"
ApplicationUser "[email protected]"
BSD sed ERE (with unescaped metacharacters): WORKS
$ cat sed-re-test.txt | sed -E '/(OtherElement|ApplicationUser)/! s/"user.name"/"[email protected]"/'
OtherElement "user.name"
OnlyReplaceMe "[email protected]"
ApplicationUser "user.name"
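Since the question loads its commands from a file, here is a minimal sketch of the ERE form used that way. The script name skip-elements.sed, the temporary directory, and the replacement address user.name@example.com are placeholders made up for illustration. The only real requirement is that sed is started with -E, which both GNU and BSD sed accept.

```shell
# Hypothetical working directory; the sample data matches sed-re-test.txt above.
workdir="${TMPDIR:-/tmp}/sed-re-test.$$"
mkdir -p "$workdir"

printf '%s\n' \
  'OtherElement "user.name"' \
  'OnlyReplaceMe "user.name"' \
  'ApplicationUser "user.name"' > "$workdir/sed-re-test.txt"

# The script file holds the command exactly as it would appear inside
# single quotes on the command line. The dot is escaped so that it only
# matches a literal ".".
cat > "$workdir/skip-elements.sed" <<'EOF'
/(OtherElement|ApplicationUser)/!s/"user\.name"/"user.name@example.com"/g
EOF

# -E switches the whole script to ERE, so ( | ) need no backslashes.
result=$(sed -E -f "$workdir/skip-elements.sed" "$workdir/sed-re-test.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

Without -E the same script file would be read as BRE, and the unescaped ( | ) would be treated as literal characters.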
Upvotes: 1
Reputation: 204258
Just use awk and avoid the convoluted, backwards logic (if X do NOT do Y but do Y for everything else vs the simple if NOT X do Y) and the version-specific constructs that you get with sed.
awk '!/OtherElement|ApplicationUser/{ gsub(/"user.name"/,"\"[email protected]\"") } 1' file
That is clear, simple, extensible and will work with any awk in any shell on any UNIX box.
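As a quick check, here is a sketch that runs the same idea against the three-line sample file from the earlier answer. The address user.name@example.com is a placeholder, and the dot in the pattern is escaped here so that it matches only a literal dot rather than any character.

```shell
# Feed the three sample lines to awk; lines that match either element
# name are skipped by the negated pattern, and the trailing 1 prints
# every line whether or not it was modified.
result=$(printf '%s\n' \
  'OtherElement "user.name"' \
  'OnlyReplaceMe "user.name"' \
  'ApplicationUser "user.name"' |
  awk '!/OtherElement|ApplicationUser/ { gsub(/"user\.name"/, "\"user.name@example.com\"") } 1')
printf '%s\n' "$result"
```

Skipping a further element type should only need another |Name added to the pattern.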
Upvotes: 0
Reputation: 58488
This might work for you (GNU sed):
sed '/OtherElement\|ApplicationUser/b;s/"user.name"/"[email protected]"/g' file
On encountering a line that you do not want to process, the b command branches to the end of the script, so sed prints the line unchanged, fetches the next one and repeats.
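If \| is not available (for example in BSD sed without -E), the same branching idea can be written with one address per element type, which needs no alternation at all. This is a sketch under that assumption rather than a recipe tested on your data, and user.name@example.com is a placeholder address.

```shell
# Each of the first two expressions ends the cycle early for one element
# type: b with no label jumps to the end of the script, so the s command
# is only reached by lines that matched neither address.
result=$(printf '%s\n' \
  'OtherElement "user.name"' \
  'OnlyReplaceMe "user.name"' \
  'ApplicationUser "user.name"' |
  sed -e '/OtherElement/b' \
      -e '/ApplicationUser/b' \
      -e 's/"user\.name"/"user.name@example.com"/g')
printf '%s\n' "$result"
```

In a command file loaded with -f, each -e expression would simply become its own line.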
Upvotes: 0