Using BeautifulSoup to parse XML with tags that contain colon

Question

I'm struggling to parse some XML with BeautifulSoup. Although I've read the documentation, I can't get it to work with the way my XML is structured.




    
        
            
                
                DATA I WANT HERE

My goal is to obtain the data from every s:key whose name attribute has the value sid (i.e. every s:key has a name, but only one in each group has the value sid).

How do I print the text inside each s:key whose name is sid?

What I've tried is:

print(tree.findAll('key', {'name'}))

as well as:

for elem in tree.feed.entry.content.dict.key:
    print(elem)

but neither of these returns the elements I'm after.

How can I get the result I want?

KunduK · Accepted Answer

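Try the code below. It searches for the full tag name, prefix included, and filters on the name attribute:

import bs4

# html_doc is assumed to hold the XML text from the question
soup = bs4.BeautifulSoup(html_doc, 'lxml')
# Match every s:key tag whose name attribute equals "sid"
elements = soup.find_all("s:key", {"name": "sid"})
for element in elements:
    print(element.text)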

Output:

DATA I WANT HERE
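
For anyone who wants to try this without the original feed, here is a minimal self-contained sketch. The XML below is a made-up stand-in (the namespace URL, the nesting, and the values are all assumptions, not the asker's real data), and it uses Python's built-in 'html.parser' only so that nothing beyond bs4 is required; the 'lxml' parser from the answer should behave the same way for this lookup.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document; the real feed in the question may differ.
xml_doc = """
<feed xmlns:s="http://example.com/ns">
  <entry>
    <content>
      <s:dict>
        <s:key name="label">first label</s:key>
        <s:key name="sid">first sid</s:key>
      </s:dict>
    </content>
  </entry>
  <entry>
    <content>
      <s:dict>
        <s:key name="label">second label</s:key>
        <s:key name="sid">second sid</s:key>
      </s:dict>
    </content>
  </entry>
</feed>
"""

# An HTML parser treats "s:key" as one plain tag name, colon included.
soup = BeautifulSoup(xml_doc, "html.parser")

# Keep only the s:key tags whose name attribute is exactly "sid".
sids = [tag.get_text(strip=True)
        for tag in soup.find_all("s:key", {"name": "sid"})]

for sid in sids:
    print(sid)
```

The point of the sketch is that the search string has to be the whole prefixed name ("s:key"), and the attribute filter has to be a dict with both the attribute and its value.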
