Johannes

Reputation: 25

Saving (printing) output as an XML file doesn't work

I wrote a little program that reads a CSV file row by row and converts each row to XML with a function. The result is stored in a variable. I then want to open an XML file and write the contents of that variable to it. However, when I open the saved file, only part of the output has been saved.

import pandas as pd
df = pd.read_csv('mytests.csv', sep=',')

def csv_to_xml(row):
    return """  <Test Testname="%s">
        <Health_Feat>%s</Health_Feat>
        <Result>%s</Result>
  </Test>""" % (row.test_name, row.health_feat, row.result)
for index, row in df.iterrows():
   xml_1 = (csv_to_xml(row))
   print(xml_1)

f = open("new_xml_1.xml","w+")
f.write(xml_1)    
f.close() 

I get this output when I print out xml_1:

  <Test Testname="test_1">
        <Health_Feat>20</Health_Feat>
        <Result>1</Result>
  </Test>
  <Test Testname="test_2">
        <Health_Feat>23</Health_Feat>
        <Result>1</Result>
  </Test>
  <Test Testname="test_3">
        <Health_Feat>24</Health_Feat>
        <Result>0</Result>
  </Test>
  <Test Testname="test_3">
        <Health_Feat>30</Health_Feat>
        <Result>1</Result>
  </Test>
  <Test Testname="test_4">
        <Health_Feat>12</Health_Feat>
        <Result>1</Result>
  </Test>
  <Test Testname="test_5">
        <Health_Feat>45</Health_Feat>
        <Result>0</Result>
  </Test>
  <Test Testname="test_6">
        <Health_Feat>34</Health_Feat>
        <Result>1</Result>
  </Test>
  <Test Testname="test_7">
        <Health_Feat>78</Health_Feat>
        <Result>1</Result>
  </Test>
  <Test Testname="test_8">
        <Health_Feat>23</Health_Feat>
        <Result>1</Result>
  </Test>
  <Test Testname="test_9">
        <Health_Feat>12</Health_Feat>
        <Result>1</Result>
  </Test>
  <Test Testname="test_10">
        <Health_Feat>12</Health_Feat>
        <Result>2</Result>
  </Test>

But when I open the "new_xml_1.xml" file, I only get:

  <Test Testname="test_10">
        <Health_Feat>12</Health_Feat>
        <Result>2</Result>
  </Test>

I don't really know why my program behaves so strangely. I think it has something to do with looping through the rows of the CSV.

Thanks for any help. I am new to Python and programming, so I want to gain a bit of experience.

Upvotes: 1

Views: 474

Answers (2)

jboockmann

Reputation: 1025

Each iteration of the loop overwrites xml_1, so after the loop has finished the variable contains only the last row of your data. You must either move the code that writes the variable to the file into the loop, or collect the output for each row in a list and write the list to the file afterwards. The code snippet below implements the former approach:

import pandas as pd
df = pd.read_csv('mytests.csv', sep=',')

def csv_to_xml(row):
    return """  <Test Testname="%s">
        <Health_Feat>%s</Health_Feat>
        <Result>%s</Result>
  </Test>""" % (row.test_name, row.health_feat, row.result)

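# Open the file once; the with-block closes it automatically.
with open("new_xml_1.xml", "w") as f:
    for index, row in df.iterrows():
        xml_1 = csv_to_xml(row)
        print(xml_1)
        # Write inside the loop, adding a newline so the elements do not run together.
        f.write(xml_1 + "\n")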
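
The latter approach (collecting the output for every row and writing it afterwards) could look roughly like the sketch below. It is only an illustration under a few assumptions: the `Row` namedtuple and the three sample rows are hypothetical stand-ins for the rows of `mytests.csv`, so that the snippet runs without the CSV file or pandas, and the file is written to the temporary directory so that nothing existing is overwritten.

```python
from collections import namedtuple
import os
import tempfile

# Hypothetical stand-in for the DataFrame rows read from 'mytests.csv'.
Row = namedtuple("Row", ["test_name", "health_feat", "result"])
rows = [Row("test_1", 20, 1), Row("test_2", 23, 1), Row("test_3", 24, 0)]

def csv_to_xml(row):
    return """  <Test Testname="%s">
        <Health_Feat>%s</Health_Feat>
        <Result>%s</Result>
  </Test>""" % (row.test_name, row.health_feat, row.result)

# Collect one XML fragment per row instead of overwriting a single variable.
fragments = [csv_to_xml(row) for row in rows]

# Write all fragments after the loop, one below the other.
# The temporary directory is an assumption made for this sketch only.
out_path = os.path.join(tempfile.gettempdir(), "new_xml_1.xml")
with open(out_path, "w") as f:
    f.write("\n".join(fragments) + "\n")
```

With the DataFrame from the question, the list could presumably be built the same way from `df.iterrows()`, for example `[csv_to_xml(row) for _, row in df.iterrows()]`.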

Upvotes: 1

Joona

Reputation: 105

Only the last output is saved because you write the variable xml_1 to the file after the for loop, at which point it holds only the last row's output. Open the file before the loop and write inside the loop, just as you print inside the loop.

f = open("new_xml_1.xml", "w")

for index, row in df.iterrows():
    xml_1 = csv_to_xml(row)
    print(xml_1)
    # Write each row as it is produced; the newline keeps the elements apart.
    f.write(xml_1 + "\n")

f.close()

Upvotes: 2
