Reputation: 499
I have a backend API that returns the response below. I want to extract the pid value "1664953412.79414".
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xml" href="/static/atom.xsl"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:s="http://dev.splunk.com/ns/rest" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" shp_request_proxied_from="3DB91F64-892E-4DB2-9271-C5CB5CAFBFBB">
<title>jobs</title>
<updated>2022-10-05T10:48:30-07:00</updated>
<author>
<name>Splunk</name>
</author>
<opensearch:totalResults>1</opensearch:totalResults>
<entry>
<published>2022-10-05T00:03:34.000-07:00</published>
<author>
<name>abc-pull</name>
</author>
<content type="text/xml">
<s:dict>
<s:key name="pid">1664953412.79414</s:key>
</s:dict>
</content>
</entry>
</feed>
I have tried various approaches, but I am not able to extract the data.
from xml.dom import minidom
pid = minidom.parseString(response.text).getElementsByTagName('pid')[0].childNodes[0].nodeValue
Then I tried this:
import xml.etree.ElementTree as ET
root = ET.fromstring(response.text)
print(root.tag)
print(root.find('entry'))
But I am not getting the entry tag data either. Can someone please help here? Note: I cannot use xmltodict, as that's not available in my enterprise packages.
Upvotes: 0
Views: 89
Reputation: 16187
You can use BeautifulSoup to pull the text node value of the s:key tag that has the attribute name="pid". It is a capable parser for HTML and XML DOM contents.
xml_doc = '''
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xml" href="/static/atom.xsl"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:s="http://dev.splunk.com/ns/rest" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" shp_request_proxied_from="3DB91F64-892E-4DB2-9271-C5CB5CAFBFBB">
<title>jobs</title>
<updated>2022-10-05T10:48:30-07:00</updated>
<author>
<name>Splunk</name>
</author>
<opensearch:totalResults>1</opensearch:totalResults>
<entry>
<published>2022-10-05T00:03:34.000-07:00</published>
<author>
<name>abc-pull</name>
</author>
<content type="text/xml">
<s:dict>
<s:key name="pid">1664953412.79414</s:key>
</s:dict>
</content>
</entry>
</feed>
'''
from bs4 import BeautifulSoup
# Raw string so the backslash that escapes the colon reaches the CSS selector unchanged
pid = BeautifulSoup(xml_doc, 'lxml').select_one(r's\:key[name="pid"]').text
print(pid)
Output:
1664953412.79414
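Since the question mentions that only certain packages are available, a standard-library route may also be worth trying. The sketch below assumes the two attempts in the question failed for these reasons: the feed declares a default namespace, so ElementTree knows the element as {http://www.w3.org/2005/Atom}entry rather than plain entry, and pid is an attribute value on s:key, not a tag name. The NS mapping and the variable names are my own, and the XML is a trimmed copy of the response (in practice you would pass response.text).

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Trimmed copy of the response from the question; use response.text in practice.
xml_doc = """<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:s="http://dev.splunk.com/ns/rest">
  <entry>
    <content type="text/xml">
      <s:dict>
        <s:key name="pid">1664953412.79414</s:key>
      </s:dict>
    </content>
  </entry>
</feed>"""

# The prefixes are local aliases chosen for this sketch; only the URIs
# have to match the ones declared in the document.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "s": "http://dev.splunk.com/ns/rest",
}

root = ET.fromstring(xml_doc)

# An unqualified lookup finds nothing because of the default namespace.
print(root.find("entry"))

# Namespace-qualified lookups work.
entry = root.find("atom:entry", NS)
key = root.find(".//s:key[@name='pid']", NS)
pid_et = key.text
print(pid_et)

# minidom variant: match on the qualified tag name, then filter by attribute.
dom = minidom.parseString(xml_doc)
pid_dom = next(
    node.firstChild.nodeValue
    for node in dom.getElementsByTagName("s:key")
    if node.getAttribute("name") == "pid"
)
print(pid_dom)
```

If the real response contains several s:key elements, the [@name='pid'] predicate (or the getAttribute filter in the minidom variant) is what picks out the right one.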
Upvotes: 1