Abhi

Reputation: 3

How to import data with IMPORTXML using XPath in Google Sheets when the page has namespaces?

I am trying to use IMPORTXML in Google Sheets on https://travel.rakuten.co.jp/HOTEL/40130/review.html to get the review score for each category, such as food or service, but I am stuck. Reading online, I realised that this page declares namespaces on its html element.

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ja" lang="ja" dir="ltr" xmlns:og="http://ogp.me/ns#" >

The specific section is

<dd>
<ul>
<li><span class="name">サービス</span><span class="rate rate35">3.43</span></li>
<li><span class="name">立地</span><span class="rate rate40">3.79</span></li>
<li><span class="name">部屋</span><span class="rate rate35">3.14</span></li>
<li><span class="name">設備・アメニティ</span><span class="rate rate35">3.43</span></li>
<li><span class="name">風呂</span><span class="rate rate40">3.64</span></li>
<li><span class="name">食事</span><span class="rate rate35">3.50</span></li>
</ul>
</dd>

This is the XPath I am using so far, but it does not work.

//*[local-name()='span'][@class='name'][text()='立地']/following-sibling::*/*[local-name() = 'span']

The Google sheet can be found here: https://docs.google.com/spreadsheets/d/1EhZhyhhVyUHQJ1FOTSSyWnqtmD9zKRRGaAlvKcFMn4g/edit#gid=1848100649

Upvotes: 0

Views: 654

Answers (1)

E.Wiest

Reputation: 5905

Rakuten seems to block Google Sheets from downloading the page. You can use the IMPORTFROMWEB add-on to retrieve the data instead.

XPaths used:

//dl[contains(.,"項目別の評価")]//span[1] 
//dl[contains(.,"項目別の評価")]//span[2]

Formula:

=IMPORTFROMWEB(B1;B2:C2)
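
To see what the two XPaths above select, here is a minimal Python sketch that runs roughly equivalent queries on the fragment from the question. Several things are assumptions: the enclosing dl/dt with the heading 項目別の評価 is inferred from the XPaths (it is not in the quoted fragment), the function name `category_ratings` is made up, and the standard library's ElementTree only supports a subset of XPath, so `contains(.,"...")` is emulated with a text check. Unlike the add-on, ElementTree is namespace-aware, so the XHTML namespace has to be mapped to a prefix. The real page is probably not well-formed XML, so this only illustrates the selection logic.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Fragment from the question. The <dl>/<dt> wrapper is an assumption
# inferred from the XPaths in the answer.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ja" lang="ja" dir="ltr" xmlns:og="http://ogp.me/ns#">
<body>
<dl>
<dt>項目別の評価</dt>
<dd>
<ul>
<li><span class="name">サービス</span><span class="rate rate35">3.43</span></li>
<li><span class="name">立地</span><span class="rate rate40">3.79</span></li>
<li><span class="name">部屋</span><span class="rate rate35">3.14</span></li>
<li><span class="name">設備・アメニティ</span><span class="rate rate35">3.43</span></li>
<li><span class="name">風呂</span><span class="rate rate40">3.64</span></li>
<li><span class="name">食事</span><span class="rate rate35">3.50</span></li>
</ul>
</dd>
</dl>
</body>
</html>"""


def category_ratings(xhtml, heading="項目別の評価"):
    """Map each category name to its rating (hypothetical helper)."""
    ns = {"x": XHTML_NS}  # prefix for the default XHTML namespace
    root = ET.fromstring(xhtml)
    ratings = {}
    for dl in root.iterfind(".//x:dl", ns):
        # Emulates dl[contains(., heading)]
        if heading not in "".join(dl.itertext()):
            continue
        names = dl.findall(".//x:span[1]", ns)  # first span of each parent
        rates = dl.findall(".//x:span[2]", ns)  # second span of each parent
        for name, rate in zip(names, rates):
            ratings[name.text] = float(rate.text)
    return ratings


if __name__ == "__main__":
    for category, score in category_ratings(SAMPLE).items():
        print(category, score)
```

The first XPath yields the category names and the second the scores, which is why the formula takes a two-cell range of selectors.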

Note: the number of requests is limited. Check the add-on's pricing, or write your own Google Apps Script.

You can also use the Rakuten API "SimpleHotelSearch" (you'll need an Application ID, though). You can test your query here:

https://webservice.rakuten.co.jp/explorer/api/Travel/SimpleHotelSearch/

The request will look like this, with "hotelNo" as the parameter:

https://app.rakuten.co.jp/services/api/Travel/SimpleHotelSearch/20170426?format=json&hotelNo=40130&applicationId=XXXXXXX

For this particular hotel it does not help, though, because every rating comes back as 0:

"hotelRatingInfo": {
            "serviceAverage": 0,
            "locationAverage": 0,
            "roomAverage": 0,
            "equipmentAverage": 0,
            "bathAverage": 0,
            "mealAverage": 0
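
The API route can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (`build_url`, `find_key`, `has_ratings`) are made up, the sample response uses a placeholder wrapper because the full response layout is not shown here, and no network request is made. The search looks for "hotelRatingInfo" at any depth so that it does not depend on the exact nesting.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://app.rakuten.co.jp/services/api/Travel/SimpleHotelSearch/20170426"


def build_url(hotel_no, application_id):
    """Build the request URL shown above (hypothetical helper)."""
    query = urlencode(
        {"format": "json", "hotelNo": hotel_no, "applicationId": application_id}
    )
    return f"{BASE_URL}?{query}"


def find_key(node, key):
    """Return the first value stored under `key` anywhere in parsed JSON."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def has_ratings(info):
    """True if at least one rating is non-zero."""
    return bool(info) and any(value != 0 for value in info.values())


if __name__ == "__main__":
    print(build_url(40130, "XXXXXXX"))  # XXXXXXX stands for your Application ID

    # Placeholder response: only "hotelRatingInfo" is taken from the answer,
    # the "wrapper" key is an assumption for illustration.
    sample = json.loads(
        '{"wrapper": [{"hotelRatingInfo": {"serviceAverage": 0,'
        ' "locationAverage": 0, "roomAverage": 0, "equipmentAverage": 0,'
        ' "bathAverage": 0, "mealAverage": 0}}]}'
    )
    info = find_key(sample, "hotelRatingInfo")
    print("ratings available:", has_ratings(info))
```

Checking for all-zero ratings before writing values into the sheet avoids mistaking an empty result for real scores.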

Upvotes: 1
