Reputation: 131
I have a script that extracts data elements from XML files. I would like to run this on a directory(folder) of XMLs rather than a single one. This is what I have so far:
from xml.dom import minidom
from datetime import *
import os
import glob
filename = glob.glob("*.xml")
f = open(filename)
for xml in f:
print (xml)
xmldoc = minidom.parse(xml)
tcd = xmldoc.getElementsByTagName("QualityMeasureDocument")[0]
sport = activitiesElement.attributes["root"]
sportName = sport.value
print (sportName)
I am getting this error:
Traceback (most recent call last):
File "C:/Python34/Scripts/process.py", line 7, in <module>
f = open(filename)
TypeError: invalid file: ['CMS9v2.xml', 'country_data.xml', 'test.xml']
activitiesElement = tcd.getElementsByTagName("id")[0]
It would be nice to make this into a function as well.
Upvotes: 0
Views: 181
Reputation: 63162
Extract your current parsing to a function:
def parsefile (filename):
f = open(filename)
for xml in f:
print (xml)
xmldoc = minidom.parse(xml)
tcd = xmldoc.getElementsByTagName("QualityMeasureDocument")[0]
sport = activitiesElement.attributes["root"]
sportName = sport.value
print (sportName)
Call it:
for file in glob.glob(*.xml):
parsefile (file)
In general, all you need to change to make part of a python script a function is to indent it and add a line
def functionname (var1, var2... ):
Where var1 etc. are names defined earlier that it relies upon.
Upvotes: 2
Reputation: 4265
glob.glob
returns a list of filenames. You are treating a list as file. try it this way
filenames = glob.glob("*.xml")
for filename in filenames:
f = open(filename)
...
Upvotes: 3