md-86

Reputation: 83

How to parse XML in Erlang?

I have an XML string embedded in a list of tuples:

MessageResponse = [{"code",0},{"description","description"},{"respuestaServicioSoap",{{"executeWebServiceSolutionResult",{{"CEDULARUCSpecified", false},{"AUTORIZACION", "00000012431781"},{"AUTORIZACIONSpecified",true},{"RESULTADO","000"},{"CODIGO_RESULTADOSpecified",true},{"COD_PAGO","00000012431781"},{"COD_PAGOSpecified",true},{"COMISION",{{"string","0"}}},{"COMISIONSpecified", true},{"DIRECCIONSpecified", false},{"FECHA_COMPENSACIONSpecified", false},{"FECHA_TRANSACCION","20170116"},{"FECHA_TRANSACCIONSpecified",true},{"FECHORA_SW","20170116123951"},{"FECHORA_SWSpecified",true},{"HORA_TRANSACCION","123951"},{"HORA_TRANSACCIONSpecified",true},{"MENSAJE","TRANSACCION OK"},{"MENSAJESpecified",true},{"NOMBRESpecified",false},{"PRODUCTO","0010761005"},{"PRODUCTOSpecified",true},{"SECUENCIA_ADQ","2833"},{"SECUENCIA_ADQSpecified",true},{"SECUENCIA_SW","576167"},{"SECUENCIA_SWSpecified",true},{"TERMINAL","0696069603000001"},{"TERMINALSpecified",true},{"TYPE_TRNSpecified",false},{"VALOR_TOTAL", { {  "string",  "0" }}},{"VALOR_TOTALSpecified",true},{"XML_ADDSpecified",false},{"XML_DATASpecified",false},{"XML_FACT","<XML_FACT>\r\n  <DATOS_FACT>\r\n    <LINEA_1>REPRESENTACIONES ORMAN S.A.</LINEA_1>\r\n    <LINEA_2>RUC: 0987654321</LINEA_2>\r\n    <LINEA_3 />\r\n    <LINEA_4 />\r\n    <LINEA_5>FACTURA: 001-627-0000048745</LINEA_5>\r\n    <LINEA_6>CLAVE: </LINEA_6>\r\n    <LINEA_7>COMISION POR SERVICIO</LINEA_7>\r\n    <LINEA_8>RECAUDACION EEAAPP - CUENTA: 11223344</LINEA_8>\r\n    <LINEA_12>FACTURA: 001-627-0000048745 - CONSULTE SU DOCUMENTO EN WWW.LITO.COM/DOCUMENTOSELECTRONICOS</LINEA_12>\r\n    <MSGCOMP />\r\n    <MSGFACT />\r\n  </DATOS_FACT>\r\n</XML_FACT>"},{"XML_FACTSpecified",true},{"XML_REPLY_CONSULTASpecified",false},{"XML_REPLY_PAGOSSpecified",false}}},{"executeWebServiceSolutionResultSpecified", true}}},{"result", "ok"}]

I need to get the text inside the LINEA_5 tag.

How can this be done?

Upvotes: 1

Views: 2375

Answers (2)

Pascal

Reputation: 14042

The OTP library xmerl provides functions for working with XML files and strings. It also defines a set of records that represent the different kinds of XML nodes.

Documentation is available in the xmerl reference manual of the Erlang/OTP documentation.

The records are defined in erlXX/lib/xmerl-YYY/include/xmerl.hrl and include:

  • #xmlText{}
  • #xmlElement{}
  • #xmlPI{}
  • #xmlComment{}
  • #xmlDecl{}

[edit]

The XML data in your example has already been processed, so I will use an example of my own. Consider an XML file with the following content:

<?xml version="1.0"  encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="uuid_id">
    <metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:opf="http://www.idpf.org/2007/opf" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:calibre="http://calibre.kovidgoyal.net/2009/metadata" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:creator opf:role="aut" opf:file-as="Ahern, Cecelia">Cecelia Ahern</dc:creator>
    <dc:publisher>J'ai Lu</dc:publisher>
    <meta name="calibre:title_sort" content="Si tu me voyais maintenant"/>
    <dc:description>description blah blah</dc:description>
    <meta name="calibre:timestamp" content="2012-03-18T18:04:20+00:00"/>
    <dc:title>Si tu me voyais maintenant</dc:title>
    <meta name="cover" content="cover"/>
    <dc:date>2012-03-18T18:04:23+00:00</dc:date>
    <dc:contributor opf:role="bkp">calibre (0.8.42) [http://calibre-ebook.com]</dc:contributor>
    <dc:identifier opf:scheme="ISBN">9782290006504</dc:identifier>
    <dc:identifier id="uuid_id" opf:scheme="uuid">7d062b17-258e-4268-9d46-a753c063c969</dc:identifier>
    <dc:subject>Chick-lit</dc:subject>
    <meta name="calibre:user_categories" content="{}"/>
    <meta name="calibre:author_link_map" content="{&quot;Cecelia Ahern&quot;: &quot;&quot;}"/>
    <dc:language>fr</dc:language>
    </metadata>
    <manifest>
        <item href="cover.jpeg" id="cover" media-type="image/jpeg"/>
    </manifest>
    <spine toc="ncx">
        <itemref idref="titlepage"/>
    </spine>
    <guide>
        <reference href="titlepage.xhtml" type="cover" title="Cover"/>
    </guide>
</package>

It is extracted from an EPUB book and stored in a file named "content.opf". To get the author name (the dc:creator element, line 4), I can do the following in the Erlang shell:

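1> rr("C:\\My programs\\erl8.2\\lib\\xmerl-1.3.12\\include\\xmerl.hrl"). % load the xmerl record definitions
2> {Xml,_} = xmerl_scan:file("../doc/content.opf"),                  % parse the file; Xml is the root element
2> Content = Xml#xmlElement.content,                                 % children of <package>
2> [MetaRec] = [X || X <- Content, X#xmlElement.name == metadata],   % keep only the <metadata> element
2> Meta = MetaRec#xmlElement.content,                                % children of <metadata>
2> [CreatRec] = [X || X <- Meta, X#xmlElement.name == 'dc:creator'], % keep only the <dc:creator> element
2> Creat = CreatRec#xmlElement.content,                              % children of <dc:creator>
2> [CreatText] = [X || X <- Creat, is_record(X,xmlText)],            % its single text node
2> Aut = CreatText#xmlText.value.                                    % the text itself
"Cecelia Ahern"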

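The walk above repeats two steps: pick a child element by name, then read its text. A minimal sketch of those two steps as reusable functions follows. The module name `xml_walk` and the functions `child/2` and `text/1` are hypothetical names chosen for illustration, not part of xmerl; only the `#xmlElement{}` and `#xmlText{}` records come from the library.

```erlang
-module(xml_walk).
-export([child/2, text/1]).

%% Record definitions (#xmlElement{}, #xmlText{}) shipped with xmerl.
-include_lib("xmerl/include/xmerl.hrl").

%% Return the first direct child element called Name (an atom),
%% or not_found if there is none. Text and comment nodes are skipped
%% because the generator pattern only matches #xmlElement{} records.
child(Name, #xmlElement{content = Content}) ->
    case [E || #xmlElement{name = N} = E <- Content, N =:= Name] of
        [First | _] -> {ok, First};
        []          -> not_found
    end.

%% Concatenate the values of all direct text children of an element.
text(#xmlElement{content = Content}) ->
    lists:flatten([V || #xmlText{value = V} <- Content]).
```

With these helpers, the author lookup would read roughly as `{ok, Meta} = xml_walk:child(metadata, Xml)`, then `{ok, Creator} = xml_walk:child('dc:creator', Meta)`, then `xml_walk:text(Creator)`.
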
Upvotes: 2

md-86

Reputation: 83

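This can be done with the following code, where XmlFactura holds the XML string (the "XML_FACT" value):

%% The xmerl record definitions must be loaded first: -include_lib("xmerl/include/xmerl.hrl").
%% in a module, or rr(code:lib_dir(xmerl) ++ "/include/xmerl.hrl"). in the shell.
{Xml, _Rest} = xmerl_scan:string(XmlFactura).
[#xmlText{value=Linea5}] = xmerl_xpath:string("//LINEA_5/text()", Xml).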
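
For completeness, here is a sketch of the whole path from MessageResponse to the LINEA_5 text. It assumes the nesting shown in the question, where each level below the top-level list is a tuple of {Key, Value} pairs. The module name `linea5_demo` and its functions are hypothetical, and `sample_xml/0` is a shortened copy of the XML_FACT value from the question.

```erlang
-module(linea5_demo).
-export([xml_fact/1, linea5/1, sample_xml/0]).

%% Record definitions (#xmlText{}) shipped with xmerl.
-include_lib("xmerl/include/xmerl.hrl").

%% Dig the "XML_FACT" string out of the response.
%% Assumption: nested levels are tuples of pairs, as in the question,
%% so each one is turned into a list before the key lookup.
xml_fact(MessageResponse) ->
    {_, Soap} = lists:keyfind("respuestaServicioSoap", 1, MessageResponse),
    {_, Result} = lists:keyfind("executeWebServiceSolutionResult", 1,
                                tuple_to_list(Soap)),
    {_, Xml} = lists:keyfind("XML_FACT", 1, tuple_to_list(Result)),
    Xml.

%% Return {ok, Text} for the first LINEA_5 text node, or not_found
%% when the element is missing or empty.
linea5(XmlString) ->
    {Doc, _Rest} = xmerl_scan:string(XmlString),
    case xmerl_xpath:string("//LINEA_5/text()", Doc) of
        [#xmlText{value = Text} | _] -> {ok, Text};
        []                           -> not_found
    end.

%% Shortened sample of the XML_FACT value from the question.
sample_xml() ->
    "<XML_FACT>\r\n  <DATOS_FACT>\r\n    <LINEA_4 />\r\n"
    "    <LINEA_5>FACTURA: 001-627-0000048745</LINEA_5>\r\n"
    "  </DATOS_FACT>\r\n</XML_FACT>".
```

Unlike the one-line pattern match, `linea5/1` returns `not_found` instead of crashing with a badmatch when the tag is absent, which may be preferable when the service does not always fill in XML_FACT.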

Upvotes: 1
