Tautvy Da

Reputation: 85

Converting pandas dataframe to XML

I know this question has been asked before, and my last one was put on hold, so I'm now describing it in more detail. I have a CSV file of population information. I read it into pandas and now have to transform it to XML, for example like this:
<?xml version="1.0" encoding="utf-8"?>
<populationdata>
  <municipality>
    <name> Akaa </name>
    <year> 2014 </year>
    <total> 17052 ......

This is the reading part of my code:
import pandas as pd
pop = pd.read_csv(r'''directory\population.csv''', delimiter=";")

I tried doing it with a function and a loop, as in the question linked here: How do convert a pandas/dataframe to XML?. I haven't succeeded. Are there any other recommendations?

This is an example of my dataframe:
       Alahärmä  2014      0   0.1   0.2
0      Alajärvi  2014  10171  5102  5069
1      Alastaro  2014      0     0     0
2     Alavieska  2014   2687  1400  1287
3        Alavus  2014  12103  6102  6001
4  Anjalankoski  2014      0     0     0

I'm fairly new to Python, so any help is appreciated.

Upvotes: 2

Views: 6513

Answers (1)

Paula Livingstone

Reputation: 1215

The question you linked to actually has a great answer to your question, but I guess you're having difficulty transposing your data into that solution, so I've done it below for you.

OK, your level of detail is a bit sketchy. If your specific situation differs slightly then you'll need to tweak my answer, but here's something that works for me:

First off, assuming you have a text file as follows:

0       Alahärmä  2014      0   0.1   0.2
1      Alajärvi  2014  10171  5102  5069
2      Alastaro  2014      0     0     0
3     Alavieska  2014   2687  1400  1287
4        Alavus  2014  12103  6102  6001
5  Anjalankoski  2014      0     0     0

Moving on to creating the Python script, we first import that text file using the following line:

pop = pd.read_csv(r'directory\population.csv', delimiter=r"\s+", names=['cityname', 'year', 'total', 'male', 'females'])

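This brings in the text file as a dataframe and gives the new dataframe the correct column headers. Because each line holds six whitespace-separated values but only five names are given, pandas treats the leading row number as the index.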
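
For the semicolon-delimited CSV from the question, a similar call may work. This is only a minimal sketch, assuming the file has no header row: the `0.1` and `0.2` labels in the question's dataframe look like pandas renaming duplicate header values, which suggests the first data row was read as the header. The inline sample data is a hypothetical stand-in for the real file.

```python
import io
import pandas as pd

# Hypothetical stand-in for population.csv: semicolon-delimited, no header row.
sample = io.StringIO(
    "Alahärmä;2014;0;0;0\n"
    "Alajärvi;2014;10171;5102;5069\n"
    "Alastaro;2014;0;0;0\n"
)

# header=None tells pandas the file has no header row, so the first
# data row is kept as data; names supplies the column labels instead.
pop = pd.read_csv(
    sample,
    delimiter=";",
    header=None,
    names=['cityname', 'year', 'total', 'male', 'females'],
)
print(pop)
```

With the real file, the `io.StringIO` object would be replaced by the path to the CSV.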

Then, taking the function from the answer to the question you linked to, we add the following to our Python script:

def func(row):
    xml = ['<item>']
    for field in row.index:
        xml.append('  <field name="{0}">{1}</field>'.format(field, row[field]))
    xml.append('</item>')
    return '\n'.join(xml)

print('\n'.join(pop.apply(func, axis=1)))
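
One caveat: the function above inserts values as-is, so a value containing `&` or `<` would probably produce invalid XML. A possible variant, sketched here with the standard library's `xml.sax.saxutils`, escapes each value first. The name `func_escaped` and the one-row `demo` frame are made up for illustration.

```python
import pandas as pd
from xml.sax.saxutils import escape, quoteattr

def func_escaped(row):
    # Same layout as func, but with XML special characters escaped.
    xml = ['<item>']
    for field in row.index:
        # quoteattr returns the attribute value already wrapped in quotes.
        xml.append('  <field name={0}>{1}</field>'.format(
            quoteattr(str(field)), escape(str(row[field]))))
    xml.append('</item>')
    return '\n'.join(xml)

# Hypothetical one-row frame with a character that needs escaping.
demo = pd.DataFrame({'cityname': ['A & B'], 'year': [2014]})
result = '\n'.join(demo.apply(func_escaped, axis=1))
print(result)
```

For the population data in the question this likely makes no difference, since the names and numbers contain no special characters.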

Now we put it all together and get the following:

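import pandas as pd

# Read the whitespace-separated file; the leading row number becomes the index.
pop = pd.read_csv(r'directory\population.csv', delimiter=r"\s+", names=['cityname', 'year', 'total', 'male', 'females'])

def func(row):
    # Build one <item> element with a <field> child per column.
    xml = ['<item>']
    for field in row.index:
        xml.append('  <field name="{0}">{1}</field>'.format(field, row[field]))
    xml.append('</item>')
    return '\n'.join(xml)

# Apply func to every row (axis=1) and print the items one after another.
print('\n'.join(pop.apply(func, axis=1)))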

When we run the above file, we get the following output:

<item>
  <field name="cityname">Alahärmä</field>
  <field name="year">2014</field>
  <field name="total">0</field>
  <field name="male">0.1</field>
  <field name="females">0.2</field>
</item>
<item>
  <field name="cityname">Alajärvi</field>
  <field name="year">2014</field>
  <field name="total">10171</field>
  <field name="male">5102.0</field>
  <field name="females">5069.0</field>
</item>
<item>
  <field name="cityname">Alastaro</field>
  <field name="year">2014</field>
  <field name="total">0</field>
  <field name="male">0.0</field>
  <field name="females">0.0</field>
</item>
<item>
  <field name="cityname">Alavieska</field>
  <field name="year">2014</field>
  <field name="total">2687</field>
  <field name="male">1400.0</field>
  <field name="females">1287.0</field>
</item>
<item>
  <field name="cityname">Alavus</field>
  <field name="year">2014</field>
  <field name="total">12103</field>
  <field name="male">6102.0</field>
  <field name="females">6001.0</field>
</item>
<item>
  <field name="cityname">Anjalankoski</field>
  <field name="year">2014</field>
  <field name="total">0</field>
  <field name="male">0.0</field>
  <field name="females">0.0</field>
</item>
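
If you need the nested layout from your question (`populationdata` / `municipality` / `name`, `year`, `total`) rather than generic `item` / `field` elements, something along these lines might work. It is a rough sketch using the standard library's `xml.etree.ElementTree`; the small sample frame and the `tags` mapping from column names to tag names are my assumptions, not something taken from your file.

```python
import pandas as pd
import xml.etree.ElementTree as ET

# Hypothetical sample frame standing in for the one read from the CSV.
pop = pd.DataFrame({
    'cityname': ['Alajärvi', 'Alavieska'],
    'year': [2014, 2014],
    'total': [10171, 2687],
})

# Assumed mapping from dataframe columns to the tag names in the question.
tags = {'cityname': 'name', 'year': 'year', 'total': 'total'}

root = ET.Element('populationdata')
for _, row in pop.iterrows():
    municipality = ET.SubElement(root, 'municipality')
    for column, tag in tags.items():
        # ElementTree escapes special characters when serializing.
        ET.SubElement(municipality, tag).text = str(row[column])

# encoding='unicode' returns a str instead of bytes.
xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)
```

To get the `<?xml ...?>` declaration as well, writing the tree with `ET.ElementTree(root).write('population.xml', encoding='utf-8', xml_declaration=True)` should do it (the output filename is made up). Newer pandas versions (1.3 and later) also provide `DataFrame.to_xml`, which may cover this case directly.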

Upvotes: 1
