Reputation: 21
I want to remove the "error_mail" and "succeed_mail" nodes from multiple similar XML files using sed or awk.
Using sed, I tried the command below, but it is not working:
sed -i /<action name="succeed_mail">/,/<\/action>/d *.xml
Here is a sample input file (test.xml):
<workflow>
<action name="start"
-----
-----
</action>
<action name="error_mail">
<email xmlns="uri:oozie:email-action:0.1">
<to>[email protected]</to>
<cc>[email protected]</cc>
<subject>Batch Failed</subject>
<body>Batch Failed at ${node}</body>
</email>
<ok to="killjob"/>
<error to="killjob"/>
</action>
<action name="succeed_mail">
<email xmlns="uri:oozie:email-action:0.1">
<to>[email protected]</to>
<cc>[email protected]</cc>
<subject>Batch Succeed</subject>
<body>Batch completed</body>
</email>
<ok to="end"/>
<error to="end"/>
</action></r>
</workflow>
Desired output (test.xml):
<workflow>
<action name="start"
-----
-----
</action>
</workflow>
Upvotes: 1
Views: 312
Reputation: 1
Had a similar need. My process:
1. Convert the XML to a single line.
2. Put each <tag> to </tag> element on a new line of its own.
3. grep -v tag (or another string, as desired).
4. xmllint --format
This method is quite generic.
To convert XML to a single line: tr -d '\n'
Csh script for step 2, accepts xml from piped stdin
>cat xmlsinglenewline
#!/bin/csh -f
# $1 is the tag
# Usage: <command> "tag"
sed "s/<$1/\n\<$1/g" | sed "s/<\/$1>/\<\/$1\>\n/g"
Caveat: Cannot handle nested (same) tag.
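Put together, steps 1 to 3 might look like the sketch below. It assumes GNU sed (for \n in the replacement text) and a POSIX shell rather than csh; the function name drop_elements and its two arguments are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: drop_elements TAG PATTERN
# Reads XML on stdin and removes every <TAG>...</TAG> element whose text
# matches the extended regex PATTERN.
# Assumes GNU sed and no nested TAG elements (same caveat as above).
drop_elements() {
    tr -d '\n' |                                  # step 1: join everything into one line
        sed "s/<$1[ >]/\n&/g; s/<\/$1>/&\n/g" |   # step 2: one TAG element per line
        grep -Ev -e "$2" -e '^$'                  # step 3: drop matching and empty lines
}
```

Step 4 can then be chained on, for example: drop_elements action 'name="(error|succeed)_mail"' < test.xml | xmllint --format -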
Upvotes: 0
Reputation: 203189
You didn't tell us in what way "it's not working", so I'm assuming you either don't know how to use | in a regexp or don't know you have to quote your scripts.
With a sed that has -E to enable EREs:
$ sed -E '/<action name="(succeed|error)_mail">/,/<\/action>/d' file
<workflow>
<action name="start"
-----
-----
</action>
</workflow>
or with any awk:
$ awk '/<action name="(succeed|error)_mail">/{f=1} !f; /<\/action>/{f=0}' file
<workflow>
<action name="start"
-----
-----
</action>
</workflow>
That is, of course, fragile and will fail for various other layouts of the same XML which is why use of XML-aware tools is always advised.
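For comparison, here is a sketch of the XML-aware route using xmlstarlet. It assumes xmlstarlet is installed, that the files are well-formed (the sample as posted would first need its <action name="start" tag closed), and that the action elements are not in an XML namespace; if the real files declare a default namespace, the XPath would need adjusting. The function name remove_mail_actions is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: delete the two mail actions from every file given
# as an argument, editing each file in place (-L).
# Assumes xmlstarlet is installed and <action> has no namespace.
remove_mail_actions() {
    xmlstarlet ed -L \
        -d '//action[@name="error_mail" or @name="succeed_mail"]' \
        "$@"
}
```

It would be called as remove_mail_actions *.xml, ideally on copies of the files first, since -L overwrites the originals.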
Upvotes: 0
Reputation: 133428
Experts always advise using tools like xmlstarlet to parse XML files, but since the OP is using sed, I am offering this awk
solution. Fair warning: this is written for the shown samples ONLY; if your input looks different, it may not work.
awk '
/^ +<\/action>/ && foundSuccess{
foundSuccess=""
next
}
/^ +<\/action>/ && foundError{
foundError=""
next
}
/^ +<action name="error_mail">$/{
foundError=1
}
/^ +<action name="succeed_mail">/{
foundSuccess=1
}
NF && !foundError && !foundSuccess
' Input_file
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
/^ +<\/action>/ && foundSuccess{ ##Checking if line has </action> and variable foundSuccess is SET then do following.
foundSuccess="" ##Nullify variable foundSuccess here.
next ##next will skip all further statements from here.
}
/^ +<\/action>/ && foundError{ ##Checking if line has </action> and variable foundError is SET then do following.
foundError="" ##Nullify variable foundError here.
next ##next will skip all further statements from here.
}
/^ +<action name="error_mail">$/{ ##Checking if line starts with space and have <action name="error_mail">
foundError=1 ##Setting variable foundError to 1 here.
}
/^ +<action name="succeed_mail">/{ ##Checking if line starts with space and have <action name="succeed_mail">
foundSuccess=1 ##Setting variable foundSuccess to 1 here.
}
NF && !foundError && !foundSuccess ##Checking if line is NOT empty AND variable foundError AND variable foundSuccess is NOT set then print that line.
' Input_file ##Mentioning Input_file name here.
NOTE: To process multiple XML files, pass *.xml in place of Input_file, but this will not save the changes in place. For an in-place save, use GNU awk and change awk to awk -i inplace in the above code. It is safer to test it on a few files before running with the inplace option. This link shows how to do in-place editing with awk while keeping a backup of Input_file: https://stackoverflow.com/a/16529730/5866580
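Where GNU awk's inplace extension is not available, much the same effect can be had by keeping a backup copy and rewriting each file from it. This is a minimal sketch assuming a POSIX shell and any POSIX awk. The function name strip_mail_actions and the .bak suffix are illustrative choices, and the awk program is a simplified flag-based filter rather than the exact script above.

```shell
#!/bin/sh
# Hypothetical helper: rewrite each file given as an argument, keeping the
# original as FILE.bak.
strip_mail_actions() {
    for f in "$@"; do
        cp -- "$f" "$f.bak" || return 1
        awk '
            /<action name="(error|succeed)_mail">/ { skip = 1 }  # entering an unwanted block
            !skip && NF                                          # print non-empty lines outside it
            /<\/action>/ { skip = 0 }                            # closing tag ends any skip
        ' "$f.bak" > "$f" || return 1
    done
}
```

Calling strip_mail_actions *.xml leaves the cleaned files in place, with the untouched originals next to them as *.xml.bak.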
Upvotes: 0