Reputation: 49
I have managed to parse an XML file, and now I want to pass the results into an array that will be used later on. The XML is opened, read, and parsed, and I pick out 3 items from each entry (channel, start and title). As shown in the code below, start holds a date and time; I am able to split it into its parts and store the result in date. As the code loops through each XML entry, I would like to pick out the channel, start and title and store them in a multidimensional array. I have done this in BrightScript but can't understand the array or list structure of Python. Once I have all entries in the array or list, I will need to parse that array pulling out all titles and dates with the same date. Can somebody guide me through this?
from datetime import datetime
from xml.dom import minidom

# xmldoc must already hold the file name or an open file object
xmldoc = minidom.parse(xmldoc)

def getNodeText(node):
    # Join the text of all text nodes directly under this node
    result = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            result.append(child.data)
    return ''.join(result)

title = xmldoc.getElementsByTagName("title")[0]
#print("Node Name : %s" % title.nodeName)
#print("Node Value : %s \n" % getNodeText(title))

programmes = xmldoc.getElementsByTagName("programme")
for programme in programmes:
    cid = programme.getAttribute("channel")
    starts = programme.getAttribute("start")
    # Assumes start begins with a 14-digit YYYYMMDDhhmmss timestamp
    cutdate = starts[0:15]
    year = int(cutdate[0:4])
    month = int(cutdate[4:6])
    day = int(cutdate[6:8])
    hour = int(cutdate[8:10])
    minute = int(cutdate[10:12])
    sec = int(cutdate[12:14])
    date = datetime(year, month, day, hour, minute, sec)
    title = programme.getElementsByTagName("title")[0]
    print("id:%s, title:%s, starts:%s" %
          (cid, getNodeText(title), starts))
    print(date)
Upvotes: 0
Views: 603
Reputation: 881705
Python normally refers to arrays as lists, and it looks like what you want is a list of lists (there's an array module and the whole numpy extension with its own arrays, but it doesn't look like you want that:-).
So start the desired list as empty:
results = []
and where you now just print things, append them to the list:
results.append([cid, getNodeText(title), date])
(or whatever fields you need -- the indentation in the question as posted is ambiguous about where the loop body ends, so put the append inside the for loop, where the print calls are now:-).
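Putting those two pieces together, a minimal self-contained sketch could look like the following. The inline XML sample, the `SAMPLE_XML` name, and the `parse_start` helper are made up for illustration, and the `start` attribute is assumed to begin with a 14-digit `YYYYMMDDhhmmss` timestamp; adjust if your file differs.

```python
from datetime import datetime
from xml.dom import minidom

# Hypothetical sample data; the real file's layout may differ.
SAMPLE_XML = """<tv>
  <programme channel="c1" start="20130105060000 +0000"><title>News</title></programme>
  <programme channel="c2" start="20130105073000 +0000"><title>Weather</title></programme>
  <programme channel="c1" start="20130106060000 +0000"><title>Sports</title></programme>
</tv>"""


def get_node_text(node):
    """Concatenate the text nodes directly under `node`."""
    return ''.join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)


def parse_start(starts):
    """Hypothetical helper: parse the first 14 characters as YYYYMMDDhhmmss."""
    return datetime.strptime(starts[:14], "%Y%m%d%H%M%S")


xmldoc = minidom.parseString(SAMPLE_XML)

results = []  # a list of [channel, title, datetime] lists
for programme in xmldoc.getElementsByTagName("programme"):
    cid = programme.getAttribute("channel")
    date = parse_start(programme.getAttribute("start"))
    title = programme.getElementsByTagName("title")[0]
    results.append([cid, get_node_text(title), date])

print(results[0][1])   # title of the first entry
print(len(results))    # number of entries collected
```

Each inner list is one entry, so `results[i][0]` is the channel, `results[i][1]` the title, and `results[i][2]` the `datetime`.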
Now for the part
"I will need to parse that array pulling out all titles and dates with the same date"
just sort the results by date:
import operator
results.sort(key=operator.itemgetter(2))
then group by that:
import itertools
for date, items in itertools.groupby(results, operator.itemgetter(2)):
    print(date, [it[1] for it in items])
or whatever else you want to do with this grouping.
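One thing to watch: if `date` is a full `datetime` (as in the question), grouping on `operator.itemgetter(2)` only merges entries whose start times match to the second. If "same date" means the same calendar day, a sketch along these lines groups on `.date()` instead. The sample `results` rows and the `day_of` helper are invented for illustration.

```python
import itertools
from datetime import datetime

# Invented sample rows in the [channel, title, datetime] shape used above.
results = [
    ["c1", "Sports", datetime(2013, 1, 6, 6, 0, 0)],
    ["c1", "News", datetime(2013, 1, 5, 6, 0, 0)],
    ["c2", "Weather", datetime(2013, 1, 5, 7, 30, 0)],
]


def day_of(row):
    """Key function: the calendar day of the row's datetime."""
    return row[2].date()


# groupby only merges adjacent items, so sort by start time first.
results.sort(key=lambda row: row[2])

titles_by_day = {}
for day, rows in itertools.groupby(results, key=day_of):
    titles_by_day[day] = [row[1] for row in rows]

for day, titles in titles_by_day.items():
    print(day, titles)
```

Sorting on the full `datetime` keeps entries from the same day adjacent and also orders the titles within each day by start time.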
You could improve this style in many ways but this does appear to give you the key functionality you're asking for.
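One possible improvement, sketched here under the same assumptions: name the fields with `collections.namedtuple` instead of relying on positions, and collect titles per day in a `collections.defaultdict`, which avoids the sort-then-group step. The `Programme` type and the sample rows are hypothetical.

```python
from collections import defaultdict, namedtuple
from datetime import datetime

# Hypothetical record type; the field names are chosen for illustration.
Programme = namedtuple("Programme", ["channel", "title", "start"])

programmes = [
    Programme("c1", "Sports", datetime(2013, 1, 6, 6, 0, 0)),
    Programme("c1", "News", datetime(2013, 1, 5, 6, 0, 0)),
    Programme("c2", "Weather", datetime(2013, 1, 5, 7, 30, 0)),
]

# Map each calendar day to the titles starting on that day.
titles_by_day = defaultdict(list)
for p in programmes:
    titles_by_day[p.start.date()].append(p.title)

# Iterate over the days in chronological order.
for day in sorted(titles_by_day):
    print(day, titles_by_day[day])
```

With this layout, titles within a day stay in the order they were read from the file rather than being sorted by start time.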
Upvotes: 1