Reputation: 173
I have an XML file like this:
<?xml version="1.0" encoding="UTF-8"?>
<Main xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:gml="http://www.opengis.net/gml/3.2" xmlns="http://cnig.gouv.fr/pcrs" gml:id="PlanCorpsRueSimplifie.1" version="2.0">
<gml:boundedBy>
</gml:boundedBy>
<featureMember>
<EmpriseEchangePCRS gml:id="EmpriseEchangePCRS.12189894">
<datePublication>2020-05-13</datePublication>
<type>Cellules</type>
<geometrie>
<gml:MultiSurface gml:id="EmpriseEchangePCRS.12189894-0" srsName="EPSG:3944" srsDimension="3">
<gml:surfaceMember>
<gml:Surface gml:id="EmpriseEchangePCRS.12189894-1">
<gml:patches>
</gml:patches>
</gml:Surface>
I would like to convert this file to a JSON file. I tried the following, but I always get the same error:
import xmltodict
import xml.etree.ElementTree as ET
root = ET.fromstring(open('JeuxTestv2.gml').read())
print(xmltodict.parse(root)['Main'])
Error:
Traceback (most recent call last):
File "C:\Users\xmltodict.py", line 6, in <module>
print(xmltodict.parse(root)['Main'])
File "C:\Users\xmltodict.py", line 327, in parse
parser.Parse(xml_input, True)
TypeError: a bytes-like object is required, not 'xml.etree.ElementTree.Element'
Upvotes: 5
Views: 10547
Reputation: 52
I am using Python 3.7.6.
ET.fromstring() parses XML that is already held in a string. It does not accept a file path, so passing a path fails:
import os
import xml.etree.ElementTree as et
xml_doc_path = os.path.abspath(r"C:\dir1\path\to\file\example.xml")
root = et.fromstring(xml_doc_path)
print(root)
This example raises the following error, because the path itself is parsed as if it were XML:
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 2
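The difference between the two entry points can be sketched with a throwaway file. The file name and the sample XML below are made up for illustration; only ET.parse(), ET.fromstring() and ET.ParseError come from the standard library.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample document, written to a temporary file for the demo.
xml_text = "<root><element2 attribute1='second attribute'>some data</element2></root>"

with tempfile.TemporaryDirectory() as tmp_dir:
    xml_path = os.path.join(tmp_dir, "example.xml")
    with open(xml_path, "w", encoding="utf-8") as xml_file:
        xml_file.write(xml_text)

    # ET.parse() takes a path (or file object) and returns an ElementTree.
    root_from_file = ET.parse(xml_path).getroot()

    # ET.fromstring() takes the XML text itself and returns the root Element.
    root_from_text = ET.fromstring(xml_text)

    # Passing the path to fromstring() fails, because a path is not XML.
    try:
        ET.fromstring(xml_path)
        path_parsed = True
    except ET.ParseError:
        path_parsed = False

print(root_from_file.tag, root_from_text.tag, path_parsed)
```

So a path goes to ET.parse(), and text that has already been read goes to ET.fromstring().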
In your code, xmltodict.parse() receives an Element object rather than XML text, which is what the TypeError is complaining about. I used ET.tostring() to serialize the element back to XML (bytes, when an encoding such as 'UTF-8' is given), which is a valid argument for xmltodict.parse(). See the ET.tostring() documentation for details.
The code below parses an XML file and writes a JSON file. I used my own XML example. Make sure all the XML tags are closed properly.
XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<element1 attribute1 = 'first attribute'>
</element1>
<element2 attribute1 = 'second attribute'>
some data
</element2>
</root>
PYTHON CODE:
import os
import xmltodict
import xml.etree.ElementTree as et
import json
xml_doc_path = os.path.abspath(r"C:\directory\path\to\file\example.xml")
xml_tree = et.parse(xml_doc_path)
root = xml_tree.getroot()
# Serialize the root element back to XML bytes, setting the encoding and method explicitly
to_string = et.tostring(root, encoding='UTF-8', method='xml')
xml_to_dict = xmltodict.parse(to_string)
with open("json_data.json", "w") as json_file:
    json.dump(xml_to_dict, json_file, indent=2)
OUTPUT: The above code will create the following JSON file:
{
  "root": {
    "element1": {
      "@attribute1": "first attribute"
    },
    "element2": {
      "@attribute1": "second attribute",
      "#text": "some data"
    }
  }
}
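If ElementTree is not needed for anything else, the round trip through ET.tostring() can probably be skipped, since xmltodict.parse() also accepts the raw file contents. A minimal sketch, assuming xmltodict is installed; the helper name xml_file_to_json and its parameters are hypothetical, not part of either library:

```python
import json

import xmltodict


def xml_file_to_json(xml_path, json_path):
    """Hypothetical helper: convert an XML file to a JSON file via xmltodict."""
    # Read raw bytes so the parser can honour the encoding in the XML declaration.
    with open(xml_path, "rb") as xml_file:
        data = xmltodict.parse(xml_file.read())

    # Write the resulting dict as indented JSON.
    with open(json_path, "w", encoding="utf-8") as json_file:
        json.dump(data, json_file, indent=2)

    return data
```

For the file in the question, this would be called as xml_file_to_json("JeuxTestv2.gml", "JeuxTestv2.json"), and the top-level key of the returned dict would be 'Main'.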
Upvotes: 4