Reputation: 85
I know this question has been asked before, and my last one was put on hold, so this time I'm describing it in more detail.
I have a CSV file of population information. I read it into pandas and now need to transform it to XML, for example like this:
<?xml version="1.0" encoding="utf-8"?>
<populationdata>
<municipality>
<name>
Akaa
</name>
<year>
2014
</year>
<total>
17052
......
This is the reading part of my code:
import pandas as pd
pop = pd.read_csv(r'''directory\population.csv''', delimiter=";")
I tried doing it with a function and a loop, as described in the linked question How do convert a pandas/dataframe to XML?, but haven't succeeded. Are there any other recommendations?
This is an example of my dataframe:
Alahärmä 2014 0 0.1 0.2
0 Alajärvi 2014 10171 5102 5069
1 Alastaro 2014 0 0 0
2 Alavieska 2014 2687 1400 1287
3 Alavus 2014 12103 6102 6001
4 Anjalankoski 2014 0 0 0
I'm fairly new to Python, so any help is appreciated.
Upvotes: 2
Views: 6513
Reputation: 1215
The question you linked to actually has a great answer to your question, but I guess you're having difficulty adapting your data to that solution, so I've done it below for you.
OK, your level of detail is a bit sketchy. If your specific situation differs slightly then you'll need to tweak my answer, but here's something that works for me:
First off, assuming you have a text file as follows:
0 Alahärmä 2014 0 0.1 0.2
1 Alajärvi 2014 10171 5102 5069
2 Alastaro 2014 0 0 0
3 Alavieska 2014 2687 1400 1287
4 Alavus 2014 12103 6102 6001
5 Anjalankoski 2014 0 0 0
Moving on to creating the Python script, we first import that text file using the following line:
pop = pd.read_csv(r'directory\population.csv', delimiter=r"\s+", names=['cityname', 'year', 'total', 'male', 'females'])
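This brings in the text file as a dataframe and gives the new dataframe the correct column headers. Because the file has six columns but only five names are given, pandas uses the first column (the row number) as the index.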
Then, taking the function from the answer to the question you linked to, we add the following to our Python script:
def func(row):
    xml = ['<item>']
    for field in row.index:
        xml.append(' <field name="{0}">{1}</field>'.format(field, row[field]))
    xml.append('</item>')
    return '\n'.join(xml)
print('\n'.join(pop.apply(func, axis=1)))
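As an aside, func builds the XML by plain string formatting, so a value containing & or < would end up in the output unescaped. A minimal sketch of an escaped variant, assuming you want the same <item>/<field> layout; the name row_to_xml is hypothetical, and the sample row is made up purely to show the escaping:

```python
from xml.sax.saxutils import escape, quoteattr

def row_to_xml(row):
    """Hypothetical variant of func: same layout, but special characters are escaped."""
    xml = ['<item>']
    # .items() works for a plain dict and for a pandas Series (one dataframe row).
    for field, value in row.items():
        # quoteattr() adds the surrounding quotes itself; escape() handles &, < and >.
        xml.append('  <field name={0}>{1}</field>'.format(
            quoteattr(str(field)), escape(str(value))))
    xml.append('</item>')
    return '\n'.join(xml)

# Made-up row, used only to demonstrate the escaping.
print(row_to_xml({'cityname': 'A & B <C>', 'year': 2014}))
```

Since a dataframe row is a Series, this should drop into the same pop.apply(..., axis=1) call in place of func.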
Now we put it all together and we get the below:
import pandas as pd
pop = pd.read_csv(r'directory\population.csv', delimiter=r"\s+", names=['cityname', 'year', 'total', 'male', 'females'])
def func(row):
    xml = ['<item>']
    for field in row.index:
        xml.append(' <field name="{0}">{1}</field>'.format(field, row[field]))
    xml.append('</item>')
    return '\n'.join(xml)
print('\n'.join(pop.apply(func, axis=1)))
When we run the above file we get the following output:
<item>
<field name="cityname">Alahärmä</field>
<field name="year">2014</field>
<field name="total">0</field>
<field name="male">0.1</field>
<field name="females">0.2</field>
</item>
<item>
<field name="cityname">Alajärvi</field>
<field name="year">2014</field>
<field name="total">10171</field>
<field name="male">5102.0</field>
<field name="females">5069.0</field>
</item>
<item>
<field name="cityname">Alastaro</field>
<field name="year">2014</field>
<field name="total">0</field>
<field name="male">0.0</field>
<field name="females">0.0</field>
</item>
<item>
<field name="cityname">Alavieska</field>
<field name="year">2014</field>
<field name="total">2687</field>
<field name="male">1400.0</field>
<field name="females">1287.0</field>
</item>
<item>
<field name="cityname">Alavus</field>
<field name="year">2014</field>
<field name="total">12103</field>
<field name="male">6102.0</field>
<field name="females">6001.0</field>
</item>
<item>
<field name="cityname">Anjalankoski</field>
<field name="year">2014</field>
<field name="total">0</field>
<field name="male">0.0</field>
<field name="females">0.0</field>
</item>
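The output above uses generic <item> and <field> tags, while the question asks for <populationdata> and <municipality> elements with one child tag per column. A rough sketch of one way to get that shape with the standard library's xml.etree.ElementTree, assuming the column names are valid XML tag names; the function name records_to_xml and its parameters are hypothetical, and the sample rows are copied from the data above:

```python
import xml.etree.ElementTree as ET

def records_to_xml(records, root_tag="populationdata", row_tag="municipality"):
    """Hypothetical helper: one row_tag element per record, one child tag per column."""
    root = ET.Element(root_tag)
    for record in records:
        row_el = ET.SubElement(root, row_tag)
        for column, value in record.items():
            # Column names become tag names; ElementTree escapes the text itself.
            ET.SubElement(row_el, column).text = str(value)
    if hasattr(ET, "indent"):  # pretty-printing is only available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")

# Two rows from the sample data, written out as plain dicts.
sample = [
    {"name": "Alajärvi", "year": 2014, "total": 10171},
    {"name": "Alavus", "year": 2014, "total": 12103},
]
xml_text = records_to_xml(sample)
print(xml_text)
```

With the dataframe from the script, the list of dicts could come from pop.to_dict('records'), and renaming the cityname column to name would give the <name> tag from the question. Newer pandas versions (1.3 and later) also provide DataFrame.to_xml, which accepts root_name and row_name arguments and may be simpler if it is available to you.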
Upvotes: 1