I have an XML file: https://github.com/schogini/jVoiD/blob/master/Modules/jVoidCustomers/src/main/webapp/WEB-INF/spring/webcontext/DispatcherServlet-context.xml
I am using the following code to parse it:
from lxml import etree
xslt_root = etree.parse("/Users/cbuser1/CodeBlueFabricator/src/poc/PythonParser/mvc-config.xml")
print(xslt_root)
The program prints:
<lxml.etree._ElementTree object at 0x10e95bcc8>
Now I need to loop through this object and get the XPath of every single element in the XML file. Any ideas?
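For what it's worth, lxml's parsed tree has a `getpath()` method, so looping over `xslt_root.iter()` and calling `xslt_root.getpath(element)` on each element should give the paths directly. The sketch below shows the same idea using only the standard library's `xml.etree.ElementTree` instead of lxml, so the steps are visible. The helper names `local_name` and `iter_xpaths` and the `SAMPLE` document are made up for illustration, and the paths use local tag names with any namespace dropped, so treat them as readable labels rather than namespace-aware XPath expressions.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag (hypothetical helper)."""
    return tag.rsplit("}", 1)[-1]


def iter_xpaths(element, path=""):
    """Yield (path, element) for the element and all of its descendants."""
    if not path:
        path = "/" + local_name(element.tag)
    yield path, element

    # Only real elements have string tags; skip anything else defensively.
    children = [child for child in element if isinstance(child.tag, str)]
    totals = Counter(local_name(child.tag) for child in children)
    seen = Counter()
    for child in children:
        name = local_name(child.tag)
        seen[name] += 1
        # Add a 1-based position only when siblings share the same name.
        step = name if totals[name] == 1 else f"{name}[{seen[name]}]"
        yield from iter_xpaths(child, f"{path}/{step}")


# Made-up sample document, loosely shaped like a Spring context file.
SAMPLE = """<beans>
  <bean id="a"><property name="x"/><property name="y"/></bean>
  <bean id="b"/>
</beans>"""

root = ET.fromstring(SAMPLE)
paths = [path for path, _ in iter_xpaths(root)]
for path in paths:
    print(path)
```

For a file on disk, `ET.parse(filename).getroot()` would take the place of `ET.fromstring(SAMPLE)`.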
Upvotes: 1
Views: 797
Actually, I was able to find a solution. I converted my XML file to JSON using this:
import json
import xmltodict
# Read the whole XML file into a string.
with open("INSERT XML FILE LOCATION HERE", 'r') as f:
    xmlInString = f.read()
print("The xml file read-")
print(xmlInString)
# Parse the XML into nested dicts/lists, then serialize it as JSON.
JsonedXML = json.dumps(xmltodict.parse(xmlInString), indent=4)
print("\nJSON format of read xml file-")
print(JsonedXML)
with open("myJson.json", 'w') as f:
    f.write(JsonedXML)
Then I went through the JSON, found all the innermost nodes, and saved their values to a text file using this:
import json

with open('GIVE LOCATION OF THE CONVERTED JSON HERE') as f:
    data = json.load(f)

token_key_value_dictionary = []
only_tokens_dictionary = []
uniqueKey = 'xml'

def recursive_json_parser(start_point_value, uniqueKey, start_point_key=''):
    # Extend the dotted path with the current key, if there is one.
    if start_point_key != '':
        uniqueKey += '.' + start_point_key
    # On Python 3, str covers all text (Python 2 also needed a unicode check).
    if isinstance(start_point_value, str):
        # Innermost node: record its path and its value.
        token_key_value_dictionary.append([uniqueKey, start_point_value])
        only_tokens_dictionary.append(start_point_value)
    elif isinstance(start_point_value, list):
        for i in start_point_value:
            recursive_json_parser(i, uniqueKey)
    elif isinstance(start_point_value, dict):
        for key, value in start_point_value.items():
            recursive_json_parser(value, uniqueKey, key)
    # Anything else (e.g. None for an empty element) is skipped.

for key, value in data.items():
    print(len(value))
    recursive_json_parser(value, uniqueKey, key)

with open('tokens.txt', 'w') as f:
    for row in only_tokens_dictionary:
        print(row)
        if row != '':
            f.write(row + '\n')
In the second program, I walk the JSON, which consists of nested lists and dictionaries, down to the innermost nodes: those that hold only a key and a value, with no further list or dictionary inside.
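The traversal described here can also be sketched as a small generator that returns the leaves instead of appending to global lists. This is only an illustration under a few assumptions: the function name `flatten` and the sample `data` are made up, the sample merely imitates the kind of nested structure an XML-to-dict conversion produces, and unlike the program above it yields every non-container leaf, not just strings.

```python
def flatten(node, prefix="xml"):
    """Yield (dotted_key, value) for every innermost value in nested dicts/lists."""
    if isinstance(node, dict):
        # Descend into each key, extending the dotted path.
        for key, value in node.items():
            yield from flatten(value, f"{prefix}.{key}")
    elif isinstance(node, list):
        # List items share the path of the list itself.
        for item in node:
            yield from flatten(item, prefix)
    else:
        # Neither dict nor list: this is an innermost node.
        yield prefix, node


# Made-up data shaped like a converted XML document.
data = {
    "beans": {
        "bean": [
            {"@id": "a", "property": {"@name": "x"}},
            {"@id": "b"},
        ]
    }
}

leaves = list(flatten(data))
for key, value in leaves:
    print(key, "=", value)
```

Because list items reuse the list's path, repeated elements end up with identical keys; adding an index to the prefix inside the list branch would keep them apart if that matters.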
Upvotes: 1