Taras

Reputation: 276

Convert XML into dictionary

I need to convert an XML file into a dictionary (which will later be converted to JSON).

A sample of the XML looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API 0.7.55.3 9da5e7ae">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2018-06-17T15:31:02Z"/>
...
  <node id="2188497873" lat="52.5053306" lon="13.4360114">
    <tag k="alt_name" v="Spreebalkon"/>
    <tag k="name" v="Brommybalkon"/>
    <tag k="tourism" v="viewpoint"/>
    <tag k="wheelchair" v="yes"/>
  </node>
...
</osm>

With this simple code I have already extracted all the values I need for my dictionary:

Code

import xml.etree.ElementTree as ET

input_file = r"D:\berlin\trial_xml\berlin_viewpoint_locations.xml"

tree = ET.parse(input_file)
root = tree.getroot()

lst1 = tree.findall("./node")
for item1 in lst1:
    print('id:',item1.get('id'))
    print('lat:',item1.get('lat'))
    print('lon:',item1.get('lon'))
    for item1_tags_and_nd in item1.iter('tag'):
        print(item1_tags_and_nd.get('k') + ":", item1_tags_and_nd.get('v'))

Result

id: 2188497873
lat: 52.5053306
lon: 13.4360114
alt_name: Spreebalkon
name: Brommybalkon
tourism: viewpoint
wheelchair: yes

Can you please help me put these values into a dictionary properly and efficiently?

I want it to look like:

{'id': '2188497873', 'lat': 52.5053306, 'lon': 13.4360114, 'alt_name': 'Spreebalkon', 'name': 'Brommybalkon', 'tourism': 'viewpoint', 'wheelchair': 'yes'}

I have tried the following,

dictionary = {}
dictionary['id'] = []
dictionary['lat'] = []
dictionary['lon'] = []
lst1 = tree.findall("./node")
for item1 in lst1:
    dictionary['id'].append(item1.get('id'))
    dictionary['lat'].append(item1.get('lat'))
    dictionary['lon'].append(item1.get('lon'))
    for item1_tags_and_nd in item1.iter('tag'):
       dictionary[item1_tags_and_nd.get('k')] = item1_tags_and_nd.get('v')

but it does not work so far.

Upvotes: 2

Views: 4375

Answers (1)

Stephen Rauch

Reputation: 49784

I suggest you construct a list of dicts instead of a dict of lists, like this:

result_list = []
for item in tree.findall("./node"):
    dictionary = {}
    dictionary['id'] = item.get('id')
    dictionary['lat'] = item.get('lat')
    dictionary['lon'] = item.get('lon')
    result_list.append(dictionary)

Or as a dict comprehension nested inside a list comprehension:

result_list = [{k: item.get(k) for k in ('id', 'lat', 'lon')}
               for item in tree.findall("./node")]

And for the nested case, with the tags collected in a sub-dictionary:

result_list = [{k: (item.get(k) if k != 'tags' else
                    {i.get('k'): i.get('v') for i in item.iter('tag')})
                for k in ('id', 'lat', 'lon', 'tags')}
               for item in tree.findall("./node")]

Result (one entry of result_list, shown for the sample node):

{
    'id': '2188497873',
    'lat': '52.5053306',
    'lon': '13.4360114',
    'tags': {
        'alt_name': 'Spreebalkon',
        'name': 'Brommybalkon',
        'tourism': 'viewpoint',
        'wheelchair': 'yes'
    }
}
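
If you want the flat layout from the question instead (tags merged into the same dict, with lat and lon as numbers), something along these lines should work. This is only a sketch: SAMPLE_XML is a trimmed copy of the sample node, node_to_flat_dict is a helper name I made up, and I am assuming that no tag uses the key id, lat or lon, because such a tag would overwrite the attribute value.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from the question (assumption: the real file
# has the same shape and would be loaded with ET.parse(path).getroot()).
SAMPLE_XML = """
<osm version="0.6" generator="Overpass API 0.7.55.3 9da5e7ae">
  <node id="2188497873" lat="52.5053306" lon="13.4360114">
    <tag k="alt_name" v="Spreebalkon"/>
    <tag k="name" v="Brommybalkon"/>
    <tag k="tourism" v="viewpoint"/>
    <tag k="wheelchair" v="yes"/>
  </node>
</osm>
"""


def node_to_flat_dict(node):
    """Build one flat dict per <node>: attributes first, then its tags."""
    record = {
        "id": node.get("id"),           # kept as a string, as in the question
        "lat": float(node.get("lat")),  # converted so JSON gets a number
        "lon": float(node.get("lon")),
    }
    # Each <tag k="..." v="..."/> becomes a key/value pair in the same dict.
    # A tag whose key is 'id', 'lat' or 'lon' would overwrite the value above.
    for tag in node.iter("tag"):
        record[tag.get("k")] = tag.get("v")
    return record


root = ET.fromstring(SAMPLE_XML)
result_list = [node_to_flat_dict(node) for node in root.findall("./node")]

# The list of dicts serializes directly to JSON.
json_text = json.dumps(result_list, indent=2)
print(json_text)
```

Keeping one dict per node (rather than one dict of lists) means nodes with different tag sets do not interfere with each other, and the whole list can be passed to json.dumps as it is.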

Upvotes: 3
