Reputation: 958
I'm using the lxml library with Python 2.6 to extract data from an XML file. Within the document I have many <Employee> tags. I iterate over each <Employee> tag, create a new instance of my Employee class, and set its member variables from the attributes of the <Employee> tag.
read_CA_tree = etree.parse(xml_tree, parser)
all_employees = []
for employee_tag in read_CA_tree.iter("Employee"):
    employee = Employee(employee_tag)
    all_employees.append(employee)
The <Employee> tag may also have one or more <EmailAddress> child tags, like so:
<Employee ID="124" Name="Foo Bar" Title="Baz">
    <EmailAddress ID="124" Address="[email protected]" />
</Employee>
My Employee object is populated by calling the get() method of lxml's Element:
class Employee(object):
    def __init__(self, employee_tag):
        self.Employee_ID = employee_tag.get("EmployeeID")
        self.First_Name = employee_tag.get("FirstName")
        self.Email_Addresses = self._collect_emails(read_CA_tree, "EmailAddress")

    def _collect_emails(self, tree, tag):
        known_addr = []
        for i in tree.iter(tag):
            known_addr.append(i)
        return known_addr
For each <Employee> tag, how can I collect the value(s) of Address from its child <EmailAddress> tags and store them as a list of email addresses in my Employee class constructor?
Upvotes: 0
Views: 230
Reputation: 53623
lxml elements expose their attributes through a dict-like interface, so get() can read Address from each <EmailAddress> element. You can try:
def _collect_emails(self, tree, tag):
    email_addr = []
    for i in tree.iter(tag):
        # get() returns the attribute value, or '' if Address is missing
        email_addr.append(i.get('Address', ''))
    return email_addr
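One thing to watch for: calling iter() on the whole parsed tree collects the <EmailAddress> elements of every employee, not just the current one. Here is a minimal sketch that scopes the search to each <Employee> element by passing employee_tag instead of the tree. The sample XML, the <Employees> root element, the example.com addresses, and the lowercase attribute names on the class are all made up for illustration. The fallback to the standard-library ElementTree is an assumption so that the sketch runs without lxml installed.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: the stdlib ElementTree offers the same iter()/get() calls
    import xml.etree.ElementTree as etree

# Made-up sample data; root element and addresses are placeholders.
SAMPLE = """
<Employees>
  <Employee ID="124" Name="Foo Bar" Title="Baz">
    <EmailAddress ID="124" Address="foo@example.com" />
    <EmailAddress ID="124" Address="bar@example.com" />
  </Employee>
  <Employee ID="125" Name="No Mail" Title="Qux" />
</Employees>
"""


class Employee(object):
    def __init__(self, employee_tag):
        self.employee_id = employee_tag.get("ID")
        self.name = employee_tag.get("Name")
        # Search only inside this <Employee> element, not the whole document
        self.email_addresses = self._collect_emails(employee_tag, "EmailAddress")

    def _collect_emails(self, element, tag):
        # Read the Address attribute of each matching descendant element
        return [child.get("Address", "") for child in element.iter(tag)]


root = etree.fromstring(SAMPLE)
all_employees = [Employee(tag) for tag in root.iter("Employee")]
for employee in all_employees:
    print(employee.employee_id, employee.email_addresses)
```

An employee without any <EmailAddress> children simply ends up with an empty list.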
Upvotes: 2