Reputation: 1156
I'm struggling to parse some XML with BeautifulSoup. I've read the documentation, but I can't get it to work with the way my XML is structured.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xml" href="/static/atom.xsl"?>
<feed xmlns:s="server url here">
<!-- Feed elements -->
<entry>
<!-- Other Elements -->
<content type="text/xml">
<s:dict>
<!-- Other keys. -->
<s:key name="sid">DATA I WANT HERE</s:key>
<!-- Other keys. -->
</s:dict>
<!-- Lots of other dicts here. -->
</content>
</entry>
<!-- Other entries -->
</feed>
My goal is to obtain the data from every s:key element whose name attribute has the value sid. (All s:key elements have a name, but only one per <entry> is of type sid.)
How do I print the text inside each s:key of type sid in my data?
What I've tried is:
print(tree.findAll('key', {'name'}))
as well as:
for elem in tree.feed.entry.content.dict.key:
    print(elem)
but neither of these does what I want.
How can I get the values I'm after?
Upvotes: 1
Views: 1275
Reputation: 33384
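Try the code below:
import bs4

# html_doc holds the XML text from the question
soup = bs4.BeautifulSoup(html_doc, 'lxml')
# Match every s:key tag whose name attribute equals "sid"
elements = soup.find_all("s:key", {"name": "sid"})
for element in elements:
    print(element.text)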
Output:
DATA I WANT HERE
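If you would rather not depend on BeautifulSoup, here is a minimal sketch of the same lookup using the standard library's xml.etree.ElementTree, which is a different technique from the code above. It assumes the s prefix is bound to a real namespace URI: http://example.com/ns is a hypothetical stand-in for the "server url here" placeholder in the question. The sample document is a trimmed-down copy of the question's feed, with a made-up second entry added to show that every match is returned.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; replace it with the value of xmlns:s in your feed.
NS = {"s": "http://example.com/ns"}

# Trimmed-down sample modelled on the feed in the question (second entry is invented).
xml_doc = """
<feed xmlns:s="http://example.com/ns">
  <entry>
    <content type="text/xml">
      <s:dict>
        <s:key name="other">ignored</s:key>
        <s:key name="sid">DATA I WANT HERE</s:key>
      </s:dict>
    </content>
  </entry>
  <entry>
    <content type="text/xml">
      <s:dict>
        <s:key name="sid">SECOND SID</s:key>
      </s:dict>
    </content>
  </entry>
</feed>
"""

root = ET.fromstring(xml_doc.strip())

# ".//" searches all descendants; the predicate keeps only keys with name="sid".
sids = [key.text for key in root.findall(".//s:key[@name='sid']", NS)]

for sid in sids:
    print(sid)
```

The namespace dictionary maps the s prefix used in the path expression to the URI declared in the document, so the prefix in your query does not have to match the one in the file as long as the URI does.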
Upvotes: 3