7ch5

Reputation: 131

Wikitravel XML Tree Structure

I'm going through the Wikitravel API, and I noticed that the XML export it provides lumps all of a page's content together in one big blob. Example: http://wikitravel.org/en/Special:Export/San_Francisco

Is there any way to get the page as a tree organized by its section headings (e.g. Understand, Get In, Get Around) instead?

Upvotes: 2

Views: 721

Answers (1)

svick

Reputation: 244878

You can use action=parse from the MediaWiki API to do this.

For example, the query http://wikitravel.org/wiki/en/api.php?format=xml&action=parse&prop=sections&page=San%20Francisco will return something like:

<api>
  <parse>
    <sections>
      <s toclevel="1" level="2" line="Districts" number="1" index="1" fromtitle="San_Francisco" byteoffset="1186" anchor="Districts"/>
      <s toclevel="1" level="2" line="Understand" number="2" index="2" fromtitle="San_Francisco" byteoffset="9563" anchor="Understand"/>
      <s toclevel="2" level="3" line="History" number="2.1" index="3" fromtitle="San_Francisco" byteoffset="9578" anchor="History"/>
      <s toclevel="2" level="3" line="Climate" number="2.2" index="4" fromtitle="San_Francisco" byteoffset="13913" anchor="Climate"/>
      <s toclevel="2" level="3" line="Literature" number="2.3" index="5" fromtitle="San_Francisco" byteoffset="16502" anchor="Literature"/>
      <s toclevel="2" level="3" line="Movies" number="2.4" index="6" fromtitle="San_Francisco" byteoffset="19404" anchor="Movies"/>
      <s toclevel="2" level="3" line="Tourist information" number="2.5" index="7" fromtitle="San_Francisco" byteoffset="23236" anchor="Tourist_information"/>
      <s toclevel="1" level="2" line="Talk" number="3" index="8" fromtitle="San_Francisco" byteoffset="24227" anchor="Talk"/>
      …
    </sections>
  </parse>
</api>

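From this, you can reconstruct the section tree: the toclevel attribute gives each section's nesting depth, and number gives its position in the outline (for example, 2.1 is the first subsection of section 2).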
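As a rough illustration of that reconstruction, here is a minimal Python sketch that nests the `<s>` elements by their toclevel attribute. The function name `build_section_tree`, the dictionary layout, and the trimmed `SAMPLE` response are assumptions made for the example, not part of the MediaWiki API; in practice you would pass in the XML returned by the query above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response shown above (some attributes omitted).
SAMPLE = """<api>
  <parse>
    <sections>
      <s toclevel="1" level="2" line="Districts" number="1" index="1" anchor="Districts"/>
      <s toclevel="1" level="2" line="Understand" number="2" index="2" anchor="Understand"/>
      <s toclevel="2" level="3" line="History" number="2.1" index="3" anchor="History"/>
      <s toclevel="2" level="3" line="Climate" number="2.2" index="4" anchor="Climate"/>
      <s toclevel="1" level="2" line="Talk" number="3" index="8" anchor="Talk"/>
    </sections>
  </parse>
</api>"""


def build_section_tree(xml_text):
    """Nest the flat <s> list into a tree using each element's toclevel.

    Hypothetical helper: returns a list of top-level sections, each a dict
    with 'title', 'index', and 'children' keys.
    """
    root = {"title": None, "index": None, "children": []}
    # Stack of (toclevel, node) pairs for the currently open ancestors.
    stack = [(0, root)]
    for s in ET.fromstring(xml_text).iter("s"):
        level = int(s.get("toclevel"))
        node = {"title": s.get("line"), "index": s.get("index"), "children": []}
        # Close every open section that is at the same depth or deeper.
        while stack[-1][0] >= level:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root["children"]


def print_tree(nodes, depth=0):
    """Print the section titles as an indented outline."""
    for node in nodes:
        print("  " * depth + node["title"])
        print_tree(node["children"], depth + 1)


if __name__ == "__main__":
    print_tree(build_section_tree(SAMPLE))
```

The index value kept on each node is the section index from the response, so it can be used to refer back to a specific section in later requests.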

Upvotes: 3
