Reputation: 1
The XML file looks like this:
<employees>
<employee>
<id>303</id>
<name>varma</name>
<age>20</age>
<salary>120000</salary>
<division>3</division>
</employee>
<employee>
<id>304</id>
<name>Cyril</name>
<age>20</age>
<salary>900000</salary>
<division>3</division>
</employee>
<employee>
<id>305</id>
<name>Yojith</name>
<age>20</age>
<salary>900000</salary>
<division>3</division>
</employee>
</employees>
I want to output this as CSV or in a tabular format without using any libraries.
I have tried using libraries, but I'm unable to do it without them. I have an idea of how to do it: 1. convert the XML to a dictionary, 2. convert the dictionary to CSV.
Upvotes: 0
Views: 7959
Reputation: 3415
I would recommend the pandas read_xml() and to_csv() functions, a 3-liner (compare the documentation for to_csv and read_xml):
import pandas as pd
df = pd.read_xml('employee.xml')
df.to_csv('out.csv', index=False)
Output -> (CSV-file):
id,name,age,salary,division
303,varma,20,120000,3
304,Cyril,20,900000,3
305,Yojith,20,900000,3
Upvotes: 4
Reputation: 372
I recommend just using libraries because they're usually very optimised. I'll talk about that later. For now, here's a way that utilises the xml.dom.minidom module, which is part of the Python standard library, so no additional libraries are required.
Edit: rewrote the last part using the standard CSV library instead of manually writing the file, as suggested by a comment. That makes for 2 Python built-in modules, not 1. The original code for the CSV writing will be at the end of the reply, if you're interested.
from xml.dom import minidom
from csv import DictWriter

# Step 1: Read and parse the XML file
# (minidom.parseString() takes the XML as a string, so read the file first)
with open('employees.xml', 'r') as xml_file:
    dom = minidom.parseString(xml_file.read())
employees = dom.getElementsByTagName('employee')

# Step 2: Extract the required information
data = []
for employee in employees:
    emp_data = {}
    for child in employee.childNodes:
        # Skip the whitespace text nodes between the tags
        if child.nodeType == minidom.Node.ELEMENT_NODE:
            # An empty element such as <name></name> has no firstChild
            emp_data[child.tagName] = child.firstChild.data if child.firstChild else ''
    data.append(emp_data)

# Step 3: Write the extracted information to a CSV file
with open('output.csv', 'w', newline='') as csv_file:
    fieldnames = ['id', 'name', 'age', 'salary', 'division']
    writer = DictWriter(csv_file, fieldnames=fieldnames)
    writer.writeheader()
    for emp_data in data:
        writer.writerow(emp_data)
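For comparison, the same three steps can also be written with xml.etree.ElementTree, another standard-library module. This is only a sketch under a few assumptions: the helper name xml_to_csv is made up for illustration, every <employee> is assumed to be a flat record with no nested elements, and the column order is taken from the first record instead of being hard-coded.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Sample input, shortened from the question (assumes an <employees> root).
XML_TEXT = """<employees>
  <employee><id>303</id><name>varma</name><age>20</age><salary>120000</salary><division>3</division></employee>
  <employee><id>304</id><name>Cyril</name><age>20</age><salary>900000</salary><division>3</division></employee>
</employees>"""


def xml_to_csv(xml_text, record_tag="employee"):
    """Hypothetical helper: flatten <record_tag> elements into CSV text."""
    root = ET.fromstring(xml_text)
    # Step 1: XML -> list of dictionaries, keyed by child tag name.
    rows = [{child.tag: (child.text or "") for child in record}
            for record in root.iter(record_tag)]
    if not rows:
        return ""
    # Step 2: dictionaries -> CSV, with the first record's keys as the header.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = xml_to_csv(XML_TEXT)
print(csv_text)
```

Note that DictWriter raises a ValueError if a later record contains a tag that is missing from the first one, so records with varying fields would need the field names collected up front.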
Don't reinvent the wheel, just realign it.
— Anthony J. D'Angelo, I think
I recommend NOT using this code. You should really just use lxml. It's extremely simple and easy to use, and it can handle complex XML structures with nested elements and attributes. Let me know how everything goes!
# Step 3 (original version): write the CSV file by hand, without the csv module
with open('output.csv', 'w') as f:
    f.write('id,name,age,salary,division\n')
    for emp_data in data:
        f.write(f"{emp_data['id']},{emp_data['name']},{emp_data['age']},{emp_data['salary']},{emp_data['division']}\n")
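One caveat with writing the lines by hand: the output breaks as soon as a value contains a comma, a double quote, or a line break, which is one reason the csv module is the safer choice. As a rough sketch of what the module takes care of, here are two hypothetical helpers (csv_field and csv_line are illustrative names, not part of the original answer) that follow the usual CSV convention of wrapping such values in quotes and doubling embedded quotes:

```python
def csv_field(value):
    """Hypothetical helper: quote a value only when it needs quoting."""
    text = str(value)
    if any(ch in text for ch in ',"\r\n'):
        # Wrap in double quotes and double any embedded double quotes.
        return '"' + text.replace('"', '""') + '"'
    return text


def csv_line(values):
    """Hypothetical helper: escape each value and join them into one CSV line."""
    return ",".join(csv_field(v) for v in values) + "\n"


line = csv_line([303, "varma, jr.", 'the "boss"', 120000])
print(line, end="")
```

With helpers like these, each f.write() call could pass its values through csv_line() instead of formatting them directly.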
Upvotes: 2