Reputation: 3
I am trying to import the table of the most recently received Gazettes into this spreadsheet. I've used the following formula to do it:
=query(IMPORTHTML("https://discover.sabinet.co.za/government_gazette_current_index", "table", 1), "Select Col2,Col3,Col4,Col5")
The problem is that when the data comes through, it excludes the urls which the titles are hyperlinked to. Is there any way that I can get the urls to pull through as well?
I've unsuccessfully tried to use an IMPORTXML formula to do it:
`=IMPORTXML("https://discover.sabinet.co.za/government_gazette_current_index","//tr/td[1]/a/@href")`
Upvotes: 0
Views: 576
Reputation: 15318
I don't exactly what you want ... try this : in A1 = url, in A2
=importxml(A1,"//@href[contains(.,'rgg_gnum')]")
and in B2
=arrayformula(if(A2:A="",,"https://discover.sabinet.co.za/"&A2:A))
https://docs.google.com/spreadsheets/d/11l2mDXV-hrcbZQmNf6Bs9AUgOCZg0r-Y0DVdZiacuB8/edit?usp=sharing
Upvotes: 1