Eswar

Reputation: 1212

Dump xml data into a cell in csv file using python

I have XML data that also contains HTML. I'm trying to write this XML into a single cell of a CSV file that also has other columns. Right now the XML gets split and spills into adjacent cells, so reading the CSV with pandas throws this error:

Error tokenizing data. C error: Expected 94 fields in line 3, saw 221

I also looked at a similar question, but it didn't help: the data there came from a database, so the workarounds don't apply here.

I am not looking to parse the XML data. I just want to save the entire XML data into one cell in a csv file.

I cannot share a data sample for confidentiality reasons, but I hope the issue is clear.

Any help is appreciated.

Upvotes: 2

Views: 247

Answers (2)

Token Joe

Reputation: 177

You can use the built-in csv module. Try wrapping the XML as a string inside a list:

import csv

xml = ["""<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications 
      with XML.</description>
   </book>
</catalog>"""]

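# newline="" lets the csv module control line endings itself
# (avoids extra blank rows on Windows).
with open("test.csv", "w", encoding="utf8", newline="") as out_file:
    writer = csv.writer(out_file)
    # Each list item becomes one cell; the writer quotes the XML
    # because it contains commas, quotes, and newlines.
    writer.writerow(xml)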

You should then be able to read it with pandas.
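To illustrate why this works, here is a minimal sketch (not your actual data) with a couple of other columns next to the XML. The column names and the `xml_payload` value are made up for the example. The csv writer quotes any field that contains commas, double quotes, or newlines, so the whole XML stays in one cell, and a reader that understands CSV quoting gets it back unchanged. It writes to an in-memory buffer just to keep the example self-contained.

```python
import csv
import io

# Hypothetical row: two ordinary columns plus one XML/HTML payload
# containing commas, double quotes, and newlines.
xml_payload = '<note id="1">\n  <body><p>Hello, "world"</p></body>\n</note>'
row = ["42", "example", xml_payload]

# In-memory stand-in for a file opened with newline="".
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["id", "label", "xml"])  # header (made-up column names)
writer.writerow(row)                     # the XML field is quoted automatically

# Read it back with a CSV-aware reader and check the cell layout.
buffer.seek(0)
rows = list(csv.reader(buffer))
print(len(rows[1]))               # number of cells in the data row
print(rows[1][2] == xml_payload)  # whether the XML came back intact
```

If the file was originally built by joining strings with commas by hand, the XML is never quoted, which would explain the field-count mismatch in the pandas error.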

Upvotes: 2

Dariusz Krynicki

Reputation: 2718

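import pandas as pd

# Read the whole XML file into a single string.
with open('note.xml', 'r') as f:
    data = f.read()

# One row, one column: the entire XML document sits in a single cell.
df = pd.DataFrame(data={'xml_file': [data]})

# to_csv quotes fields containing commas, quotes, or newlines,
# so the XML is not split across cells.
# Note: the default index is also written as an unnamed first column.
df.to_csv('xml_as_csv.csv')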

Upvotes: 1
