Cao Doremi
Cao Doremi

Reputation: 91

IMPORTHTML() doesn't work in this webpage structure

I want to import data from a table within an HTML page. I applied Google's example IMPORTHTML("http://en.wikipedia.org/wiki/Demographics_of_India","table",4) to my case, changing various parameters, but it doesn't work after all.

Could anyone tell me what is wrong with my following formula?

=IMPORTHTML("https://s.cafef.vn/screener.aspx#data","table",1)

I want to import the table from https://s.cafef.vn/screener.aspx#data The first 2 rows of the table look like:

1 Công ty cổ phần 32 A32 UpCom -15.42 212.16 -100.00 6.91 4.52 0.00 31.20 2 Công ty Cổ phần Nhựa An Phát Xanh AAA HSX 0.00 3,426.19 -5.24 1.39 11.09 0.78 0.00 ...............

I tried raising the index of the table from 1 to 30 but they all don't work at all

=IMPORTHTML("https://s.cafef.vn/screener.aspx#data","table",2)
=IMPORTHTML("https://s.cafef.vn/screener.aspx#data","table",3)
...
=IMPORTHTML("https://s.cafef.vn/screener.aspx#data","table",30)

Thank you very much in advance Cao

Upvotes: 1

Views: 1532

Answers (1)

Tanaike
Tanaike

Reputation: 201758

I believe your goal as follows.

  • You want to retrieve the table from the URL of https://s.cafef.vn/screener.aspx#data.

Issue and workaround:

When I saw the HTML of the URL, it seems that the table is created by Javascript. In this case, the table cannot be directly retrieved by IMPORTHTML. This has already been mentioned by the comment.

Fortunately, when I saw the HTML, I noticed that the table for creating by Javascript is set as a JSON data. In this case, I thought that when the JSON data is retrieved, the values of table might be able to be retrieved using Google Apps Script. In this answer, as a workaround, I would like to propose to achieve your goal using Google Apps Script as a custom function.

Sample script:

Please copy and paste the following script to the script editor of Spreadsheet. And, please put a custom formula of =SAMPLE("https://s.cafef.vn/screener.aspx#data") to a cell. By this, the HTML data is retrieved and parsed it, and then, the table is retrieved from the JSON data.

function SAMPLE(url) {
  const res = UrlFetchApp.fetch(url, {muteHttpExceptions: true});
  const html = res.getContentText().match(/var jsonData \=([\S\s\w]+\}\])/);
  if (!html) return "No tables. Please confirm URL again.";
  const table = JSON.parse(html[1].replace(/\n/g, ""));
  const header = ["", "FullName", "Symbol", "CenterName", "ChangePrice", "VonHoa", "ChangeVolume", "EPS", "PE", "Beta", "Price"];
  return table.reduce((ar, e, i) => {
    const temp = header.map(f => f == "" ? i + 1 : e[f]);
    ar.push(temp);
    return ar;
  }, [header]);
}

Result:

When above script is run, the following result is obtained.

enter image description here

Note:

  • In this sample script, the header of const header = ["", "FullName", "Symbol", "CenterName", "ChangePrice", "VonHoa", "ChangeVolume", "EPS", "PE", "Beta", "Price"]; is used. This is for the table you want to retrieve. These are the keys for JSON data.
  • This sample script can be used for the current HTML of the URL of https://s.cafef.vn/screener.aspx#data. So when you change the URL and the site is updated and the HTML structure is changed, this script might not be able to be used. So please be careful this.

References:

Upvotes: 1

Related Questions