I need to convert my XML file to JSON.
A sample of the XML looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API 0.7.55.3 9da5e7ae">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2018-06-17T15:31:02Z"/>
<node id="330268305" lat="52.5475000" lon="13.3850775">
<tag k="direction" v="240-60"/>
<tag k="tourism" v="viewpoint"/>
<tag k="wheelchair" v="no"/>
</node>
<node id="330269757" lat="52.5473115" lon="13.3843131">
<tag k="direction" v="240-60"/>
<tag k="tourism" v="viewpoint"/>
<tag k="wheelchair" v="limited"/>
</node>
<way id="281307598">
<center lat="52.4934004" lon="13.4843019"/>
<nd ref="2852755795"/>
<nd ref="3772363803"/>
<nd ref="3772363802"/>
<nd ref="2852755796"/>
<nd ref="2852755797"/>
<nd ref="2852755798"/>
<nd ref="2852755795"/>
<tag k="man_made" v="tower"/>
<tag k="tourism" v="viewpoint"/>
<tag k="tower:type" v="observation"/>
<tag k="wheelchair" v="yes"/>
</way>
</osm>
This is the code I have so far:
import xml.etree.ElementTree as ET
import json

input_file = r"D:\berlin\trial_xml\berlin_viewpoint_locations.xml"
tree = ET.parse(input_file)
root = tree.getroot()

result_list = [{k: (item.get(k) if k != 'extra' else
                    {i.get('k'): i.get('v') for i in item.iter('tag')})
                for k in ('id', 'lat', 'lon', 'extra')}
               for item in tree.findall("./node") + tree.findall('./way')]
print(result_list)
With the help of some Stack Overflow gurus, I have gotten about halfway there. However, I still need to understand how to:
1. Add <center lat="52.4934004" lon="13.4843019"/> of a way to the same result_list that holds the nodes. It works for 'id', as was mentioned here.
2. Add <nd ref="2852755795"/> <nd ref="3772363803"/> the same way as was done for extra, e.g. as a nested list.
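The current code does not work for you because the node and way elements do not have the same structure: a node carries lat and lon as attributes, while a way carries them on a child center element and also has a list of nd children. I would suggest independent parsers for the node and way types. You are already parsing the node types, so to parse the way types a fairly simple loop can be constructed like: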
way_list = []
for item in tree.findall("./way"):
    # get the center node
    center = item.find('center')

    # get the refs for the nd nodes
    nds = [nd.get('ref') for nd in item.iter('nd')]

    # construct a dict and append to result list
    way_list.append(dict(
        id=item.get('id'),
        lat=center.get('lat'),
        lon=center.get('lon'),
        nds=nds,
        extra={i.get('k'): i.get('v') for i in item.iter('tag')},
    ))
print(way_list)
[{
    'id': '281307598',
    'lat': '52.4934004',
    'lon': '13.4843019',
    'nds': ['2852755795', '3772363803', '3772363802', '2852755796',
            '2852755797', '2852755798', '2852755795'],
    'extra': {
        'man_made': 'tower',
        'tourism': 'viewpoint',
        'tower:type': 'observation',
        'wheelchair': 'yes'
    }
}]
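If you want the nodes and ways together in one list and written out as JSON, as the question asks, the two parsers could be combined roughly as follows. This is only a sketch under a few assumptions: the helper names tags_of, parse_node and parse_way are made up for illustration, the XML is a trimmed copy of the sample inlined as a string so the snippet runs on its own, and a way without a center child gets None for lat and lon instead of raising an error.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from the question, inlined so the sketch is
# self-contained. With a real file, use ET.parse(input_file).getroot().
SAMPLE_XML = """<osm version="0.6">
  <node id="330268305" lat="52.5475000" lon="13.3850775">
    <tag k="tourism" v="viewpoint"/>
    <tag k="wheelchair" v="no"/>
  </node>
  <way id="281307598">
    <center lat="52.4934004" lon="13.4843019"/>
    <nd ref="2852755795"/>
    <nd ref="3772363803"/>
    <tag k="man_made" v="tower"/>
    <tag k="wheelchair" v="yes"/>
  </way>
</osm>"""


def tags_of(element):
    """Collect the <tag k=... v=...> children of an element into a dict."""
    return {tag.get("k"): tag.get("v") for tag in element.iter("tag")}


def parse_node(node):
    """A node carries lat/lon directly as attributes."""
    return {
        "id": node.get("id"),
        "lat": node.get("lat"),
        "lon": node.get("lon"),
        "extra": tags_of(node),
    }


def parse_way(way):
    """A way carries lat/lon on its <center> child and lists <nd> refs."""
    center = way.find("center")
    return {
        "id": way.get("id"),
        # Assumption: fall back to None when a way has no <center> element.
        "lat": center.get("lat") if center is not None else None,
        "lon": center.get("lon") if center is not None else None,
        "nds": [nd.get("ref") for nd in way.iter("nd")],
        "extra": tags_of(way),
    }


root = ET.fromstring(SAMPLE_XML)
result_list = ([parse_node(n) for n in root.findall("./node")]
               + [parse_way(w) for w in root.findall("./way")])

# Serialize the combined list; json.dump(result_list, fh) would write to a file.
json_text = json.dumps(result_list, indent=2)
print(json_text)
```

Keeping one small function per element type means each can change independently if the node and way structures diverge further.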