jason
jason

Reputation: 4429

google apps script Xmlservice

I'm updating some code that used to use Xml.parse to parse this page https://www.sec.gov/cgi-bin/browse-edgar?company=&CIK=&type=8-k&owner=exclude&count=100&action=getcurrent

The old code uses Xml to get the table like... this

var pageTxt = UrlFetchApp.fetch(target).getContentText();
var pageDoc = Xml.parse(pageTxt,true);
var table = pageDoc.getElement().body.div.table

Xml is deprecated.

function myFunction() {
  var rss = "https://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&CIK=&type=8-k&company=&dateb=&owner=include&start=0&count=40&output=atom"
  var r = UrlFetchApp.fetch(rss).getContentText()
  var doc = XmlService.parse(r)
  var atom = XmlService.getNamespace(rss)
  var table = doc.getRootElement().getChildren('summary', atom)

}

Now I'm trying to use XmlService and having trouble getting that same table

Can some one help with the code to get the table? I need an array to loop through and go through each tr and td. Thanks.

Upvotes: 0

Views: 1442

Answers (1)

Tanaike
Tanaike

Reputation: 201378

I believe your goal as follows.

  • You want to retrieve the values from entry of the XML data and want to put the values to the Spreadsheet using Google Apps Script.

Modification points:

  • When I saw the data from the URL of https://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&CIK=&type=8-k&company=&dateb=&owner=include&start=0&count=40&output=atom, I confirmed that the data is the XML data.
  • When I saw your script, it seems that entry is not retrieved.

Modified script:

function myFunction() {
  // Retrieve the XML data from URL.
  var rss = "https://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&CIK=&type=8-k&company=&dateb=&owner=include&start=0&count=40&output=atom"
  var r = UrlFetchApp.fetch(rss).getContentText()

  // Parse the XML data.
  var doc = XmlService.parse(r)
  var root = doc.getRootElement();
  var ns = root.getNamespace();
  var entries = root.getChildren("entry", ns);
  var header = ["title", "link", "summary", "updated", "id"];
  var ar = entries.map(e => header.map(f => e.getChild(f, ns).getValue().trim()));
  ar.unshift(header);
  
  // Put the value to the active sheet.
  var sheet = SpreadsheetApp.getActiveSheet();
  sheet.getRange(1, 1, ar.length, ar[0].length).setValues(ar);
}
  • When you want to retrieve only summary, please modify var header = ["title", "link", "summary", "updated", "id"]; to var header = ["summary"];

Note:

  • I thought that in your situation, the built-in formula of IMPORTXML can be also used, in this case, how about the following formula?

      =IMPORTXML("https://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&CIK=&type=8-k&company=&dateb=&owner=include&start=0&count=40&output=atom","//*[local-name()='entry']")
    
  • Or, when you want to retrieve the values of summary, you can also use the following formula.

      =IMPORTXML("https://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&CIK=&type=8-k&company=&dateb=&owner=include&start=0&count=40&output=atom","//*[local-name()='summary']")
    

References:

Upvotes: 2

Related Questions