martin222

Reputation: 23

XML parsing in Google Apps Script

I have a problem with the XmlService.parse function in Google Apps Script. I am writing a script that needs to parse the emails in my inbox. I sent myself several test emails whose bodies have this format:

<div dir="ltr">test 1<div><br></div></div>

but if I use this line

var doc = XmlService.parse(messages[j].getBody());

I get this error

Error on line 1: The element type "br" must be terminated by the matching end-tag "</br>". (line 18, file "Code")

That is understandable, because the message contains only <br> with no closing tag. Is there a way to solve this problem, or do I have to parse the message some other way? Thank you in advance.

Edit: I have the same problem with the img tag:

Error Occurred: Error on line 38: The element type "img" must be terminated by the matching end-tag "</img>".

I need to parse the text that is inside the red frame in the screenshot of the email.

The old script used the function

Xml.parse(messag.getBody(),true)

However, this function is deprecated. I tried to use

XmlService.parse(messages.getBody());

as mentioned above, but I get errors for unpaired HTML tags. The message returned by .getBody() is linked here: getbody email

Could someone help me? Thanks once again.

Upvotes: 1

Views: 1792

Answers (1)

Spencer Easton

Reputation: 5782

XmlService cannot parse HTML; it can only parse well-formed XML. There are, however, HTML parsing libraries for Node.js. You can take one of those modules, run it through browserify, make a minor modification to the generated source, and get an Apps Script library that parses HTML.

https://github.com/fb55/htmlparser2

My generated library:

1TLbGgQBCztnB0lOhcTYKg2UpXtpdDwocvfcx44w1tqFnHDJC5ZXy_BDo
https://github.com/Spencer-Easton/Apps-Script-htmlparser2-library

Example code modified from htmlparser2 readme:

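function myFunction() {
  // init() returns the htmlparser2 module bundled in the library
  var htmlparser = htmlparser2.init();
  var parser = new htmlparser.Parser({
    // Called for every opening tag, including unpaired ones such as <br>
    onopentag: function(name, attribs){
      if(name === "div"){
        Logger.log("found div");
      }
    },
    // Called for each run of text between tags
    ontext: function(text){
      Logger.log("-->" + text);
    },
    // Called for every closing tag
    onclosetag: function(tagname){
      if(tagname === "div"){
        Logger.log("End Div");
      }
    }
  }, {decodeEntities: true});
  // Feed the HTML from the question to the parser, then signal end of input
  parser.write('<div dir="ltr">test 1<div><br></div></div>');
  parser.end();
}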
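
To extract the text instead of logging it, the same callbacks can push each text chunk into an array. This is only a minimal sketch. makeTextCollector is a hypothetical helper name and is not part of htmlparser2. Skipping the contents of style and script elements is an assumption about what you want from an email body. The sketch also assumes tag names reach the callbacks in lowercase.

```javascript
// Hypothetical helper (not part of htmlparser2): builds a handler object
// whose callbacks collect visible text instead of logging it.
function makeTextCollector() {
  var chunks = [];     // non-empty text runs, in document order
  var skipDepth = 0;   // > 0 while inside <style> or <script>

  function isSkipped(name) {
    // Assumption: tag names arrive lowercased
    return name === "style" || name === "script";
  }

  var handler = {
    onopentag: function (name, attribs) {
      if (isSkipped(name)) {
        skipDepth++;
      }
    },
    ontext: function (text) {
      var trimmed = text.trim();
      // Ignore whitespace-only runs and anything inside skipped elements
      if (skipDepth === 0 && trimmed.length > 0) {
        chunks.push(trimmed);
      }
    },
    onclosetag: function (name) {
      if (isSkipped(name) && skipDepth > 0) {
        skipDepth--;
      }
    }
  };

  return {
    handler: handler,
    getText: function () {
      return chunks.join(" ");
    }
  };
}

// Intended use inside Apps Script, following the example above:
//   var htmlparser = htmlparser2.init();
//   var collector = makeTextCollector();
//   var parser = new htmlparser.Parser(collector.handler, {decodeEntities: true});
//   parser.write(messages[j].getBody());
//   parser.end();
//   Logger.log(collector.getText());
```

The handler is a plain object, so it can be exercised by calling its callbacks directly before wiring it to the parser. Picking out one specific region of the email, such as the framed text in the question, would need an extra check in onopentag on whatever attribute identifies that region.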

Upvotes: 3
