Rob
Rob

Reputation: 51

Google Spreadsheet ImportXML xpath issue

I'm trying to import this XML feed in to a Google Spreadsheet: http://cloud.tfl.gov.uk/TrackerNet/LineStatus

but I can't seem to get the right xPath for the feed to be parsed.

I've tried:

=importxml("http://cloud.tfl.gov.uk/TrackerNet/LineStatus","/ArrayOfLineStatus")

and

=importxml("http://cloud.tfl.gov.uk/TrackerNet/LineStatus","/arrayoflinestatus//linestatus[1]@statusdetails")    

a few others but no luck. I've been using http://xmltoolbox.appspot.com/xpath_generator.html to try and get the Xpath.

Edit:

Here's a edited version of the XML:

<ArrayOfLineStatus xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"       xmlns:xsd="http://www.w3.org/2001/XMLSchema"  xmlns="http://webservices.lul.co.uk/">
<LineStatus ID="0" StatusDetails="">
<BranchDisruptions/>
<Line ID="1" Name="Bakerloo"/>
<Status ID="GS" CssClass="GoodService" Description="Good Service"     IsActive="true">
<StatusType ID="1" Description="Line"/>
</Status>    
<LineStatus ID="81" StatusDetails="">
<BranchDisruptions/>
<Line ID="81" Name="DLR"/>
<Status ID="GS" CssClass="GoodService" Description="Good Service" IsActive="true">
<StatusType ID="1" Description="Line"/>
</Status>
</LineStatus>
</ArrayOfLineStatus>    

Upvotes: 1

Views: 1805

Answers (2)

Wicket
Wicket

Reputation: 38160

The results of the formulas provided by the OP are empty, because the file use a namespace.

Note: IMPORTXML use xPath 1.0

Try the following:

  1. Create a new file.
  2. Add the following.

A1: The URL of the XML file.
A2: The xPath query. Double check that it will result will not be empty.
A3: =IMPORTXML(A1,A2)

Example 1: The following doesn't use the position instead of the node-name.

    +   ---------------------------------------------   +
    |                        A                          |
    +   ---------------------------------------------   +
  1 |   http://cloud.tfl.gov.uk/TrackerNet/LineStatus   |
  2 |   /*/*[position()<10]/@ID                         |
  3 |                                               0   |
  4 |                                               1   |
  5 |                                              10   |
  6 |                                               2   |
  7 |                                               8   |
  8 |                                               4   |
  9 |                                               9   |
 10 |                                               5   |
 11 |                                               6   |
    +   ---------------------------------------------   +

Example 2: The following use the a test predicate using the function local-name()

    +   ---------------------------------------------   +
    |                        A                          |
    +   ---------------------------------------------   +
  1 |   http://cloud.tfl.gov.uk/TrackerNet/LineStatus   |
  2 |   //*[local-name() = 'Line']/@Name                |
  3 |   Bakerloo                                        |
  4 |   Central                                         |
  5 |   Circle                                          |
  6 |   District                                        |
  7 |   Hammersmith and City                            |
  8 |   Jubilee                                         |
  9 |   Metropolitan                                    |
 10 |   Northern                                        |
 11 |   Piccadilly                                      |
 12 |   Victoria                                        |
 13 |   Waterloo and City                               |
 14 |   Overground                                      |
 15 |   TfL Rail                                        |
 16 |   DLR                                             |
 17 |   Trams                                           |
    +   ---------------------------------------------   +

See the above examples working: File

Upvotes: 1

user3616725
user3616725

Reputation: 3655

Your ImportXML() calls don't work because from google's point of view the output from the given url is not a well-formed XML. I encountered this problem many times, and as there are no tools/options you could use to pre-process the incoming data into a format that Google Spreadsheet will accept you have to give up.
(Please note I never said that there is anything wrong with your file. I think Google sheet's handling of xml is far from ideal, but not something I can fix)
Your options now are:

  1. use one of the many user-created worksheet scripts / functions available on the internet to parse your XML instead.
  2. Use one of the other import...() functions (for example =IMPORTDATA("http://cloud.tfl.gov.uk/TrackerNet/LineStatus")) and post-process the result
  3. use javascript version of the status checker and scrape that instead: (=IMPORTHTML("http://www.tfl.gov.uk/tfl/syndication/feeds/esubs/portrait-27.html","table",1))

Upvotes: 1

Related Questions