Reputation: 45
In an XML file I am searching for the string "<file:write". Each match is a complete XML tag, and within that tag there is a field holding the value I need. I am trying to extract that value into a CSV file together with the file name. The problem is that the field (path= or Path=) is the 2nd, 3rd, or 4th column depending on the line, so I am not able to use the cut command.
Is there a better way of doing this?
find /opt/mortagage/application.xml -type f -exec egrep -ri "<file:write" /dev/null {} + |uniq| sed '/<!--.*-->/d' | sed '/<!--/,/-->/d'
/opt/mortagage/application.xml: <file:write doc:id="16630" path="${file.location}" doc:name="Save file to directory">
/opt/mortagage/application.xml: <file:write doc:name="Write to complete folder" doc:id="18890" path='#["${file.completeLocation}" ++ vars.zipFileName]' config-ref="File_Config_completed">
/opt/mortagage/application.xml: <file:write doc:name="Write to complete folder" doc:id="19990" Path='#["${file.completeLocation}" ++ vars.zipFileName]' config-ref="File_Config_completed">
Upvotes: -1
Views: 64
Reputation: 36471
A "better way" would be to use a dedicated processor for structured data; in this case, a command-line XML processor can do it easily.
Using kislyuk/yq:
xq -r '.. | ."file:write"? | arrays[] // . | ."@path", ."@Path" | strings' in.xml
Using mikefarah/yq (which completely ignores namespaces):
yq -oy '.. | .write? | select(kind == "map") // .[] | ."+@path" // ."+@Path"' in.xml
Using xmlstarlet:
xmlstarlet sel -t -m '//file:write' -v '@path' -v '@Path' -n in.xml
Using libxml/xmllint:
xmllint requires you either to declare the actual namespaces (which you haven't provided in the sample) or to default to ignoring them all by resorting to a local-name() check. xmllint also doesn't support the string(…) function on multiple matches, so the best it can do is output full attribute nodes like path="${file.location}". A workaround could be to subsequently use another tool (like awk or sed) to trim them down:
xmllint --xpath '//*[local-name()="write"]/@path | //*[local-name()="write"]/@Path' in.xml |
sed 's/^.*\?="\|"$//g' # removes all up to the first =" and a final "
All of them output something like:
${file.location}
#["${file.completeLocation}" ++ vars.zipFileName]
#["${file.completeLocation}" ++ vars.zipFileName]
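If none of those tools is installed, roughly the same idea can be sketched with Python's standard library, which also makes it easy to emit the "file name, value" CSV rows the question asks for. This is only a sketch under assumptions: the sample document, its namespace URIs (urn:example:...), and the helper names local_name, write_paths and to_csv are all made up for illustration, and the real file must declare its namespaces for the parser to accept it. Matching is done on the local element name, so the namespace itself is ignored, and the attribute name is compared case-insensitively to cover both path= and Path=.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the namespace URIs are placeholders, not taken from the question.
SAMPLE = """<flow xmlns:file="urn:example:file" xmlns:doc="urn:example:doc">
  <file:write doc:id="16630" path="${file.location}" doc:name="Save file to directory"/>
  <!-- <file:write path="commented-out"/> -->
  <file:write doc:name="Write to complete folder" doc:id="19990" Path='#["${file.completeLocation}" ++ vars.zipFileName]'/>
</flow>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def write_paths(xml_text):
    """Return the path/Path attribute of every <*:write> element, in document order."""
    # The default parser discards XML comments, so commented-out tags are skipped.
    root = ET.fromstring(xml_text)
    values = []
    for element in root.iter():
        if local_name(element.tag) != "write":
            continue
        for name, value in element.attrib.items():
            if local_name(name).lower() == "path":
                values.append(value)
    return values


def to_csv(filename, xml_text):
    """Render one 'filename,value' CSV row per matching attribute."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for value in write_paths(xml_text):
        writer.writerow([filename, value])
    return out.getvalue()


print(to_csv("application.xml", SAMPLE), end="")
```

The csv module quotes any value that contains double quotes or commas, so expressions such as #["${file.completeLocation}" ++ vars.zipFileName] stay in a single column when the output is opened as CSV. For real files, reading the text with open(path).read() and passing the path as the first argument to to_csv should be enough.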
Upvotes: 0