Cao Doremi
Cao Doremi

Reputation: 91

Google sheets' IMPORTHTML() fails to keep the original format of the data

I want to use IMPORTHTML() to import data from a table at http://www.cophieu68.vn/atbottom.php; but IMPORTHTML() fails to keep the original format of the data, causing misleading information. Specifically, I enter =IMPORTHTML("http://www.cophieu68.vn/atbottom.php";"table"; 2) into A1 cell of Google sheets. Google sheets successfully imports Table 2, but wrongly adds symbol * into my data. Consequently, SVD becomes starSVDstar (Please note cell B2 on my screenshot below) or 9.3 become star9.3star (Please note cell C2). Moreover; 401,300 becomes 401,3 (Please note cell H2 ). Besides, 29.1 is understood as 29/01/2021 (Please note my mouse click at cell K2 on my screenshot below)

Here is the screenshot of the original table that i want to import:

https://drive.google.com/file/d/1wgJcO4O-ivpsXk0XetXzwdhSs4d_8Cjy/view?usp=sharing

and here is the screenshot of the table which is imported into my Google sheets

https://drive.google.com/file/d/1DLipA3o85MTGo2ktumTU_0A45vhuLSA6/view?usp=sharing

Could anyone tell me what is wrong with my formula and how can I fix this error? Thank you very much for your help.

Cao

Upvotes: 1

Views: 1197

Answers (2)

EBL
EBL

Reputation: 71

I suggest using regular expressions to solve the table import with a single formula:

=arrayformula(regexreplace(to_text(IMPORTHTML ("http://www.cophieu68.vn/atbottom.php ", "table", 2)),"\*(.*\/?)\*","$1"))

(I have added the function to_text to avoid losing the two zeros at the end of figures that are in units of thousands)

enter image description here

Upvotes: 1

CMB
CMB

Reputation: 5163

Explanation:

The source page HTML is controlled by JavaScript, so the raw output of =IMPORTHTML cannot be changed whatsoever. However, you can use this formula on a new sheet to remove the asterisks from the original table.

If your IMPORTHTML is in Sheet1 you can use this:

=ARRAYFORMULA(SUBSTITUTE(Sheet1!A2:M,"*",""))

Upvotes: 2

Related Questions