user1796260

Reputation: 297

Accessing an xml attribute in a bash script

I need to read an attribute from an XML file in a bash script, but I can use neither xmllint --xpath nor xmlstarlet because they are not available on the server where I work.

I've tried solutions with grep, cut and sed, but they are not good solutions in the long run.

xml_grep is available on the machine and I can access elements with it, but when I try to access my attribute I get "error: unrecognized expression in handler".

This is my XML file:

<?xml version="1.0" standalone="yes"?>
<p4codeline xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="p4codeline_1_1.xsd">
  <module name="kpreader" currentVersion="kpreader_1.1STD0" previousVersion="kpreader_1.0STD0">
    <codeline owner="undefined" path="//HEP/jcas/kpreader/trunk/...">
      <namingConvention/>
      <description>Main codeline for development</description>
      <rules>
        <rule>Develop on MAIN, and create a TAG codeline on release</rule>
        <rule>Never broke the build on the MAIN</rule>
      </rules>
    </codeline>
    <externals>
      <external viewPath="J2ep_BuildTools/..." codeLine="//CT/JAVA/J2ep_BuildTools/Source/tags/J2EP_BUILDTOOLS_1.6STD0/..." depotPath="."/>
    </externals>
  </module>
</p4codeline>

I need to access the path attribute of codeline, using only bash or standard command-line tools.

I've tried something like

xml_grep -t '/p4codeline/module/codeline/@path' file.xml

and it answers:

error: unrecognized expression in handler: '/p4codeline/module/codeline@path' at /usr/bin/xml_grep line 183

Upvotes: 0

Views: 2526

Answers (2)

AnJo

Reputation: 72

You can use grep as a rough substitute for XPath. For this scenario I would use

grep  -oP "(?<=[<]codeline)[^<]+" file.xml |  grep  -oP "(?<=path\=\")[^\"]*"

If you need to honor a full XPath-like expression such as /p4codeline/module/codeline/@path, you need to flatten the file into a single line and pipe it through a chain of grep commands:

tr -d '\n' <  file.xml | grep -Eo '<p4codeline .+/p4codeline>' | grep -Eo '<module .+/module>' |  grep  -oP "(?<=[<]codeline)[^<]+" | grep  -oP "(?<=path\=\")[^\"]*"
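Note that grep -P needs a grep built with PCRE support, which is not a given on a restricted server. A minimal sketch of the same flatten-then-narrow idea using only tr, grep -Eo and sed follows. The function name codeline_path is made up for illustration, and it assumes double-quoted attribute values with no ">" inside them. It replaces newlines with spaces instead of deleting them, so attributes split across lines stay separated.

```shell
# Hypothetical helper (not part of the answer above): print the path
# attribute of every <codeline ...> opening tag in the file given as $1.
# Assumes double-quoted attribute values that contain no '>' character.
codeline_path() {
  # 1. Flatten the file; newlines become spaces so attributes stay apart.
  # 2. Keep only the <codeline ...> opening tags, one per output line.
  # 3. Extract the value of the path attribute from each tag.
  tr '\n' ' ' < "$1" |
    grep -Eo '<codeline[[:space:]][^>]*>' |
    sed -n 's/.*[[:space:]]path="\([^"]*\)".*/\1/p'
}
```

After sourcing the function, calling codeline_path file.xml on the file from the question should print the value of path on the codeline element.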

Put it together in a little bash script

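#!/usr/bin/env bash
# Usage: anjopath.sh file.xml /path/to/element[/@attribute]
#    or: cat file.xml | anjopath.sh /path/to/element[/@attribute]

# Read the XML from a pipe if there is one, otherwise from the file in $1;
# either way, strip the newlines so the whole document is a single line.
if [ -p /dev/stdin ]
then
  xmlinput=$(tr -d '\n')
else
  xmlinput=$(tr -d '\n' < "${1}")
fi

# With piped input the path expression is $1, otherwise it is $2.
# The EOP ("end of path") marker tells the loop when to print the result.
if [[ -z "${2}" ]]
then
  ajpath="${1}/EOP"
else
  ajpath="${2}/EOP"
fi

# Walk the path one step at a time, narrowing xmlinput at each step.
echo "${ajpath}" | tr '/' '\n' | while read -r i
do
  if [[ "$i" == "EOP" ]]
  then
    # Left unquoted on purpose: word splitting collapses runs of whitespace.
    echo $xmlinput
    break
  fi
  if [[ "${i:0:1}" == "@" ]]
  then
    # A leading @ means: look up the value of that attribute.
    xmlinput=$(echo $xmlinput | grep -oP "(?<=${i#?}\=\")[^\"]*")
  elif [[ -n "$i" ]]
  then
    # Otherwise keep only the text from <element to /element>.
    xmlinput=$(echo $xmlinput | grep -Eo "<${i}.+/${i}>")
  fi
done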

You execute the script like this:

anjopath.sh file.xml /p4codeline/module/codeline/@path

will give you //HEP/jcas/kpreader/trunk/...

anjopath.sh file.xml /p4codeline/module/@name

will give you kpreader

anjopath.sh file.xml /p4codeline/module/codeline/rules

will give you

<rules> <rule>Develop on MAIN, and create a TAG codeline on release</rule> <rule>Never broke the build on the MAIN</rule> </rules>

You can also pipe the XML into the script like this:

cat file.xml | anjopath.sh /p4codeline/module/codeline/rules

Upvotes: 0

Zombo

Reputation: 1

As these things usually go, this command is highly dependent on your input

$ awk '/path/ {print $4}' FS='"' file.xml
//HEP/jcas/kpreader/trunk/...
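The awk one-liner relies on path being the second attribute on the line. If the attribute order might change, a sed variant keyed on the element and attribute names is a little less fragile. This is only a sketch: the function name xml_attr is made up for illustration, and it assumes the opening tag sits on a single line, attribute values are double-quoted, and no value contains ">".

```shell
# Hypothetical helper (not from the answer above): print the value of
# attribute $2 on element $1 for every matching line of file $3.
# Assumes: opening tag on one line, double-quoted values, no '>' in values.
# Caveat: the element name is matched as a prefix, so "code" also hits <codeline>.
xml_attr() {
  sed -n "s/.*<${1}[^>]*[[:space:]]${2}=\"\([^\"]*\)\".*/\1/p" "${3}"
}
```

After sourcing the function, xml_attr codeline path file.xml should print the same value as the awk command, and xml_attr module name file.xml should print kpreader.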

Upvotes: 1
