Reputation: 3
I'm trying to import the most common Library of Congress identifier for a list of books using Google Sheets. The XML file for the ISBN is http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true. XML is pasted below for ease. I want to get lcc/mostPopular[@nsfa] but the formula =importxml("http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true","lcc/mostPopular[@nsfa]")
returns "Imported content is empty."
Am I entering the xpath_query wrong?
I know the link is valid because I can import the whole thing with =importdata("http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true")
, but that gives a garbled mess of data in the spreadsheet.
<classify xmlns="http://classify.oclc.org">
<response code="0"/>
<!-- Classify is a product of OCLC Online Computer Library Center: http://classify.oclc.org -->
<work author="Piper, John, 1946-" editions="4" eholdings="97" format="Book" holdings="184" itemtype="itemtype-book" owi="769061307" title="Bloodlines : race, cross, and the Christian">696100305</work>
<authors>
<author lc="n78072014" viaf="109537817">Piper, John, 1946-</author>
</authors>
<orderBy>thold desc</orderBy>
<input type="isbn">1433528525</input>
<recommendations>
<ddc>
<mostPopular holdings="280" nsfa="270.089" sfa="270.089"/>
<mostRecent holdings="280" sfa="270.089"/>
<latestEdition holdings="280" sf2="22" sfa="270.089"/>
</ddc>
<lcc>
<mostPopular holdings="280" nsfa="BT738.27" sfa="BT738.27"/>
<mostRecent holdings="280" sfa="BT738.27"/>
</lcc>
</recommendations>
</classify>
Upvotes: 0
Views: 215
Reputation: 201358
I think that in your formula of =importxml("http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true","lcc/mostPopular[@nsfa]")
, it is required to modify the xpath. In this answer, I would like to propose to modify the xpath for achieving your goal. So, how about the following modification?
=IMPORTXML(A1,"//*[local-name()='lcc']//@nsfa")
http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true
to the cell "A1".nsfa
is existing only one under the tag lcc
. So I think that you can use //*[local-name()='lcc']//@nsfa
as the xpath.lcc/mostPopular[@nsfa]
, you can also use the xpath of //*[local-name()='lcc']/*[local-name()='mostPopular']/@nsfa
.Upvotes: 1
Reputation: 1
try:
=REGEXEXTRACT(QUERY(FLATTEN(SPLIT(QUERY(IMPORTDATA(
"http://classify.oclc.org/classify2/Classify?isbn=1433528525&summary=true"),,9^9),
"<lcc>", 0)),
"where Col1 contains 'mostPopular' offset 1"),
"nsfa=""([^\s]+)""")
Upvotes: 0