Jonathan Ciecka
Jonathan Ciecka

Reputation: 3

Google Sheets =importXML() returns "Imported Content is Empty"

I'm trying to import the most common Library of Congress identifier for a list of books using Google Sheets. The XML file for the ISBN is http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true. XML is pasted below for ease. I want to get lcc/mostPopular[@nsfa] but the formula =importxml("http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true","lcc/mostPopular[@nsfa]") returns "Imported content is empty."

Am I entering the xpath_query wrong?

I know the link is valid because I can import the whole thing with =importdata("http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true"), but that gives a garbled mess of data in the spreadsheet.

<classify xmlns="http://classify.oclc.org">
<response code="0"/>
<!-- Classify is a product of OCLC Online Computer Library Center: http://classify.oclc.org -->
<work author="Piper, John, 1946-" editions="4" eholdings="97" format="Book" holdings="184" itemtype="itemtype-book" owi="769061307" title="Bloodlines : race, cross, and the Christian">696100305</work>
<authors>
<author lc="n78072014" viaf="109537817">Piper, John, 1946-</author>
</authors>
<orderBy>thold desc</orderBy>
<input type="isbn">1433528525</input>
<recommendations>
<ddc>
<mostPopular holdings="280" nsfa="270.089" sfa="270.089"/>
<mostRecent holdings="280" sfa="270.089"/>
<latestEdition holdings="280" sf2="22" sfa="270.089"/>
</ddc>
<lcc>
<mostPopular holdings="280" nsfa="BT738.27" sfa="BT738.27"/>
<mostRecent holdings="280" sfa="BT738.27"/>
</lcc>
</recommendations>
</classify>

Upvotes: 0

Views: 215

Answers (2)

Tanaike
Tanaike

Reputation: 201358

I think that in your formula of =importxml("http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true","lcc/mostPopular[@nsfa]"), it is required to modify the xpath. In this answer, I would like to propose to modify the xpath for achieving your goal. So, how about the following modification?

Modified formula:

=IMPORTXML(A1,"//*[local-name()='lcc']//@nsfa")
  • In this formula, please put the URL of http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true to the cell "A1".
  • In this case, the attribute nsfa is existing only one under the tag lcc. So I think that you can use //*[local-name()='lcc']//@nsfa as the xpath.
  • If you want to use lcc/mostPopular[@nsfa], you can also use the xpath of //*[local-name()='lcc']/*[local-name()='mostPopular']/@nsfa.

Result:

enter image description here

References:

Upvotes: 1

player0
player0

Reputation: 1

try:

=REGEXEXTRACT(QUERY(FLATTEN(SPLIT(QUERY(IMPORTDATA(
 "http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true"),,9^9), 
 "<lcc>", 0)), 
 "where Col1 contains 'mostPopular' offset 1"), 
 "nsfa=""([^\s]+)""")

enter image description here

Upvotes: 0

Related Questions