David Oceans
David Oceans

Reputation: 37

How I can parser table values from google apps script

I'm using that in a cell but I have a lot of problems to get the information because many times the sheet is "loading" and does not load the information

=QUERY(IMPORTHTML("http://performance.morningstar.com/Performance/fund/trailing-total-returns.action?t=LU0266117414", "table", 1), "SELECT Col2, Col3, Col4, Col5, Col6, Col7, Col8, Col9 WHERE Col1='LU0266117414' LIMIT 1 LABEL Col2 '', Col3 '', Col4 '', Col5 '', Col6 '', Col7 '', Col8 '', Col9 '' ")

So, I was thinking that I would still have less problems if I read the fields of the URL table than with getcontents it reads it at the moment.

Like that

var url = http://performance.morningstar.com/Performance/fund/trailing-total-returns.action?t=LU0266117414
websiteContent = UrlFetchApp.fetch(url).getContentText()

The problem I am finding is that I can't extract only the fields from the first row. If it were a single value, I know how to do it, but several at the same time I have not achieved it

Do you know how I could do it?

Thank you very much

Upvotes: 1

Views: 107

Answers (1)

Tanaike
Tanaike

Reputation: 201768

I believe your goal as follows.

  • You want to retrieve the following values from the URL of http://performance.morningstar.com/Performance/fund/trailing-total-returns.action?t=LU0266117414.

    • -1.25 3.87 -3.97 -4.76 -0.82 54.78 29.64 28.68
  • You want to achieve this using Google Apps Script.

In this case, how about the following sample script? In this sample script, in order to parse the table, I used XmlService of Google Apps Script.

Sample script:

Please copy and paste the following script to the script editor of Google Spreadsheet, and run the function of myFunction() with the script editor. By this, the values are retrieved and put the values to the active sheet as the appending value.

function myFunction() {
  var url = "http://performance.morningstar.com/Performance/fund/trailing-total-returns.action?t=LU0266117414";
  var websiteContent = UrlFetchApp.fetch(url).getContentText();
  var trs = [...websiteContent.matchAll(/<tr class\="action">[\s\S\w]+?<\/tr>/g)];
  if (trs.length > 0) {
    var obj = XmlService.parse(trs[0][0].replace(/&mdash;/g, "-")); // Modified
    var values = obj.getRootElement().getChildren("td").map(e => {
      var v = e.getValue();
      return isNaN(v) ? v : Number(v);
    });
    values.splice(-2);
    SpreadsheetApp.getActiveSheet().appendRow(values)
  } else {
    throw new Error("Table cannot be retrieved.");
  }
}

Note:

  • When you want to use this script as the custom function, you can also the following script. In this case, please copy and paste the following script to the script editor of Spreadsheet, and save it. And, please put the formula of =SAMPLE("http://performance.morningstar.com/Performance/fund/trailing-total-returns.action?t=LU0266117414") to a cell. By this, the values are retrieved.

      function SAMPLE(url) {
        var websiteContent = UrlFetchApp.fetch(url).getContentText();
        var trs = [...websiteContent.matchAll(/<tr class\="action">[\s\S\w]+?<\/tr>/g)];
        if (trs.length > 0) {
          var obj = XmlService.parse(trs[0][0].replace(/&mdash;/g, "-")); // Modified
          var values = obj.getRootElement().getChildren("td").map(e => {
            var v = e.getValue();
            return isNaN(v) ? v : Number(v);
          });
          values.splice(-2);
          return [values];
        } else {
          throw new Error("Table cannot be retrieved.");
        }
      }
    
  • This sample script can be used for the URL of http://performance.morningstar.com/Performance/fund/trailing-total-returns.action?t=LU0266117414. So, when the structure of HTML is changed and/or you use this for other URLs, this script might not be able to be used. Please be careful this.

  • From your following replying,

    Thank you very much, looks great but there are a problem with url like that "http://performance.morningstar.com/Performance/fund/trailing-total-returns.action?t=IE00B42W3S00"

    I modified above script for both URLs of http://performance.morningstar.com/Performance/fund/trailing-total-returns.action?t=LU0266117414 and http://performance.morningstar.com/Performance/fund/trailing-total-returns.action?t=IE00B42W3S00.

References:

Upvotes: 2

Related Questions