parsing XML to CSV using python "None type error"

Question

Trying to convert XML to CSV. I am new to python parsing.

Data sample ("Dummy data")

I tried this code

import csv
import xml.etree.ElementTree as Xet
import pandas as pd

cols = ["firstName", "lastName", "age", "sex", "retired", "dependants", "marital_status", "salary", "pension", "company", "commute_distance", "address_postcode"]
rows = []
  
# Parsing the XML file
xmlparse = Xet.parse('/content/drive/MyDrive/DATAtask1/user_data.xml')
root = xmlparse.getroot()
for user in root:
    firstName = user.find("firstName").text
    lastName = user.find("lastName").text
    age = user.find("age").text
    sex = user.find("sex").text
    retired = user.find("retired").text
    dependants = user.find("dependants").text
    marital_status = user.find("marital_status").text
    salary = user.find("salary").text
    pension = user.find("pension").text
    company = user.find("company").text
    commute_distance = user.find("commute_distance").text
    address_postcode = user.find("address_postcode").text
  
    rows.append({"firstName": firstName,
                 "lastName": lastName,
                 "age": age,
                 "sex": sex,
                 "retired": retired,
                 "dependants": dependants,
                 "marital_status": marital_status,
                 "pension": pension,
                 "salary": salary,
                 "company": company,
                 "commute_distance": commute_distance,
                 "address_postcode": address_postcode})
  
df = pd.DataFrame(rows, columns=cols)
  
# Writing dataframe to csv
df.to_csv('/content/drive/MyDrive/DATAtask1/XMLtoCSV.csv')

Getting this error

AttributeError Traceback (most recent call last)

 in ()
      7 root = xmlparse.getroot()
      8 for user in root:
----> 9     firstName = user.find("firstName").text
     10     lastName = user.find("lastName").text
     11     age = user.find("age").text

AttributeError: 'NoneType' object has no attribute 'text'

3dSpatialUser · Accepted Answer

The tag has an attribute with firstName. Therefore you should use:

user.attrib['firstName']

If you check: user.attrib, it will return a dictionary (this is not true, it returns lxml.etree._Attrib which can be converted to a dictionary using (dict(user.attrib)). That will give you an opportunity to make your code easier since you can just use the dictionary like a normal python dictionary.

For example you can create a list and append all dictionaries to the list. At the end it is possible to convert a list of dictionaries to a pandas dataframe:

d1 = {'name': 'john', 'age': 19}
d2 = {'name': 'Steve', 'age': 16}
# A dictionary with an extra key:
d3 = {'name': 'Jim', 'age': 25, 'additional': 'something'}
df = pd.DataFrame([d1, d2, d3])

    name  age additional
0   john   19        NaN
1  Steve   16        NaN
2    Jim   25  something

parsing XML to CSV using python "None type error"

Answers (1)

Related Questions

parsing XML to CSV using python &quot;None type error&quot;

Answers (1)

Related Questions

parsing XML to CSV using python "None type error"