Reputation: 5769
I have a bit of script that I think is nearly there. I have worked out a crude way of writing it, but I can't work out how to get it to function as a for loop.
I am extracting data from an xml file that uses the following format:
<Trackpoint>
<Time>2012-01-17T11:44:35Z</Time>
<Position>
<LatitudeDegrees>51.920211518183351</LatitudeDegrees>
<LongitudeDegrees>26.706042898818851</LongitudeDegrees>
</Position>
<AltitudeMeters>-43.6026611328125</AltitudeMeters>
</Trackpoint>
<Trackpoint>
<Time>2012-01-17T11:45:21Z</Time>
<Position>
<LatitudeDegrees>51.920243117958307</LatitudeDegrees>
<LongitudeDegrees>26.706140967085958</LongitudeDegrees>
</Position>
<AltitudeMeters>-43.6026611328125</AltitudeMeters>
</Trackpoint>
I can use the following to get, say, the LatitudeDegrees:
from xml.dom.minidom import parse
doc = parse('/Users/name/Documents/GPS/gps.tcx')
lat = doc.getElementsByTagName("LatitudeDegrees")
lon = doc.getElementsByTagName("LongitudeDegrees")
time = doc.getElementsByTagName("Time")
trackpoint = doc.getElementsByTagName("Trackpoint")
for x in lat:
    print(x.firstChild.data)
But I would like to get the latitude, longitude and time for each trackpoint, in order.
I am guessing I need to use
for x in trackpoint:
but the only way I can work out how to do that is as follows:
count = 0
n = len(trackpoint)
while count < n:
    print(time[count].firstChild.data)
    print(lat[count].firstChild.data)
    print(lon[count].firstChild.data)
    count += 1
Does anyone have any ideas? I think I am just missing something really simple!
Upvotes: 1
Views: 1777
Reputation: 88865
I usually find parsing XML with ElementTree more readable and easier. For example, you can read the latitude in three lines:
import xml.etree.ElementTree as etree
s="""<root>
<Trackpoint>
<Time>2012-01-17T11:44:35Z</Time>
<Position>
<LatitudeDegrees>51.920211518183351</LatitudeDegrees>
<LongitudeDegrees>26.706042898818851</LongitudeDegrees>
</Position>
<AltitudeMeters>-43.6026611328125</AltitudeMeters>
</Trackpoint>
<Trackpoint>
<Time>2012-01-17T11:45:21Z</Time>
<Position>
<LatitudeDegrees>51.920243117958307</LatitudeDegrees>
<LongitudeDegrees>26.706140967085958</LongitudeDegrees>
</Position>
<AltitudeMeters>-43.6026611328125</AltitudeMeters>
</Trackpoint>
</root>
"""
root = etree.fromstring(s)
for point in root:
    print(point.find('Position/LatitudeDegrees').text)
Now suppose you want to convert each point to a dict:
varnames = [
    ('Position/LatitudeDegrees', 'lat'),
    ('Position/LongitudeDegrees', 'lon'),
    ('Time', 'time'),
    ('AltitudeMeters', 'alt')
]
points = []
for pointelem in etree.fromstring(s):
    point = {}
    for tag, varname in varnames:
        point[varname] = pointelem.find(tag).text
    points.append(point)
import pprint
pprint.pprint(points)
Output:
[{'alt': '-43.6026611328125',
  'lat': '51.920211518183351',
  'lon': '26.706042898818851',
  'time': '2012-01-17T11:44:35Z'},
 {'alt': '-43.6026611328125',
  'lat': '51.920243117958307',
  'lon': '26.706140967085958',
  'time': '2012-01-17T11:45:21Z'}]
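In a real file the Trackpoint elements are usually nested several levels below the root, so looping directly over root as above would not reach them. Here is a minimal sketch of one way around that, assuming the same element names as the sample and no XML namespace (if the file declares one, the tag names would need the namespace added). The SAMPLE string, the Track wrapper and the trackpoints_to_dicts helper are made up for illustration:

```python
import xml.etree.ElementTree as etree

# Hypothetical sample: Trackpoints nested below the root,
# and the second one has no Position at all.
SAMPLE = """<root>
  <Track>
    <Trackpoint>
      <Time>2012-01-17T11:44:35Z</Time>
      <Position>
        <LatitudeDegrees>51.920211518183351</LatitudeDegrees>
        <LongitudeDegrees>26.706042898818851</LongitudeDegrees>
      </Position>
    </Trackpoint>
    <Trackpoint>
      <Time>2012-01-17T11:45:21Z</Time>
    </Trackpoint>
  </Track>
</root>"""

# (path relative to a Trackpoint, key used in the resulting dict)
FIELDS = [
    ('Time', 'time'),
    ('Position/LatitudeDegrees', 'lat'),
    ('Position/LongitudeDegrees', 'lon'),
]


def trackpoints_to_dicts(root):
    """Return one dict per Trackpoint found anywhere below root."""
    points = []
    # iter() walks the whole tree, so the nesting depth does not matter.
    for elem in root.iter('Trackpoint'):
        # findtext() returns None when the path is missing instead of raising.
        points.append({name: elem.findtext(path) for path, name in FIELDS})
    return points


points = trackpoints_to_dicts(etree.fromstring(SAMPLE))
for p in points:
    print(p['time'], p['lat'], p['lon'])
```

Using findtext rather than find(...).text means a trackpoint without a Position yields None values instead of an AttributeError.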
Upvotes: 2
Reputation: 880987
Perhaps you are looking for zip:
import xml.dom.minidom as minidom
import os
doc = minidom.parse(os.path.expanduser('~/test/gps.tcx'))
latitudes = doc.getElementsByTagName("LatitudeDegrees")
longitudes = doc.getElementsByTagName("LongitudeDegrees")
time = doc.getElementsByTagName("Time")
trackpoint = doc.getElementsByTagName("Trackpoint")
for t, lat, lon in zip(time, latitudes, longitudes):
    print(t.firstChild.data, lat.firstChild.data, lon.firstChild.data)
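zip pairs the three lists purely by position, so this works as long as every Trackpoint contains exactly one Time, LatitudeDegrees and LongitudeDegrees. Here is a self-contained sketch of the same idea, assuming the sample data from the question wrapped in a made-up root element and parsed from a string rather than a file:

```python
import xml.dom.minidom as minidom

# Sample data from the question; the <root> wrapper is an assumption
# added so that the snippet is a well-formed document.
XML = """<root>
<Trackpoint>
<Time>2012-01-17T11:44:35Z</Time>
<Position>
<LatitudeDegrees>51.920211518183351</LatitudeDegrees>
<LongitudeDegrees>26.706042898818851</LongitudeDegrees>
</Position>
</Trackpoint>
<Trackpoint>
<Time>2012-01-17T11:45:21Z</Time>
<Position>
<LatitudeDegrees>51.920243117958307</LatitudeDegrees>
<LongitudeDegrees>26.706140967085958</LongitudeDegrees>
</Position>
</Trackpoint>
</root>"""

doc = minidom.parseString(XML)
times = doc.getElementsByTagName("Time")
latitudes = doc.getElementsByTagName("LatitudeDegrees")
longitudes = doc.getElementsByTagName("LongitudeDegrees")

# Build one (time, lat, lon) tuple per position in the three lists.
rows = [
    (t.firstChild.data, lat.firstChild.data, lon.firstChild.data)
    for t, lat, lon in zip(times, latitudes, longitudes)
]
for row in rows:
    print(*row)
```

If a Trackpoint were missing its Position, the lists would have different lengths and zip would silently pair values that belong to different points.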
Upvotes: 0
Reputation: 16327
First find all the Trackpoint elements and loop over them. Then, inside the loop, find the wanted child elements of each Trackpoint element:
from xml.dom.minidom import parse
doc = parse('in.tcx')
trackpoints = doc.getElementsByTagName("Trackpoint")
result = []
elements = ('Time', 'LatitudeDegrees', 'LongitudeDegrees')
for tp in trackpoints:
    obj = {}
    for el in elements:
        obj[el] = tp.getElementsByTagName(el)[0].firstChild.data
    result.append(obj)
print(result)
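If some Trackpoint elements lack a Position (an assumption about the data, not something shown in the question), indexing [0] on an empty result would raise an IndexError. Here is a rough sketch of one way to guard against that and convert the coordinates to floats. child_text is a hypothetical helper, and the inline XML is made up so the example runs on its own:

```python
from xml.dom.minidom import parseString

# Made-up data: the second Trackpoint has no Position element.
XML = """<root>
<Trackpoint><Time>2012-01-17T11:44:35Z</Time><Position><LatitudeDegrees>51.5</LatitudeDegrees><LongitudeDegrees>26.25</LongitudeDegrees></Position></Trackpoint>
<Trackpoint><Time>2012-01-17T11:45:21Z</Time></Trackpoint>
</root>"""


def child_text(node, tag):
    """Text of the first descendant <tag> of node, or None if absent or empty."""
    found = node.getElementsByTagName(tag)
    if len(found) == 0 or found[0].firstChild is None:
        return None
    return found[0].firstChild.data


result = []
for tp in parseString(XML).getElementsByTagName("Trackpoint"):
    lat = child_text(tp, "LatitudeDegrees")
    lon = child_text(tp, "LongitudeDegrees")
    result.append({
        "time": child_text(tp, "Time"),
        # Convert only when a value is present; keep None otherwise.
        "lat": float(lat) if lat is not None else None,
        "lon": float(lon) if lon is not None else None,
    })
print(result)
```

Because each lookup starts from the Trackpoint node itself, the values always belong to the same point, whether or not a neighbouring point is incomplete.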
Upvotes: 4