Reputation: 691
Given the below XML named test.xml in my working directory:
<workbook>
<style>
<style-rule element='worksheet'>
<format attr='font-family' value='Tahoma' />
<format attr='font-size' value='15' />
<format attr='font-weight' value='bold' />
<format attr='color' value='#ffbe7d' />
</style-rule>
</style>
</workbook>
I am trying to return the style-rule element within style and, ultimately, each of the format elements inside it. I have tried the Python code below, but None is returned:
from bs4 import BeautifulSoup
import os
with open(os.getcwd()+'//test.xml') as xmlfile:
    soup = BeautifulSoup(xmlfile, 'html.parser')
print(soup.style.find('style-rule'))
I know to use find() because of the hyphen in the element name, and this technique has worked for me on other hyphenated elements in an XML file. For some reason I can't identify, though, it fails in this instance.
Upvotes: 2
Views: 365
Reputation: 764
The hyphen isn't the problem. If you print the style tag's text, you will see that everything inside it, including style-rule, comes back as a single string.
My guess is that html.parser treats style like an HTML <style> element, whose content is raw text (normally CSS) rather than child tags, whereas here you are using it as a container for other elements.
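To check that guess without BeautifulSoup in the way, here is a minimal sketch using the standard library's html.parser.HTMLParser, which is what the 'html.parser' option builds on. The TagLogger class and its attribute names are made up for this illustration, and the XML from the question is inlined so the snippet is self-contained.

```python
from html.parser import HTMLParser

# Same document as test.xml in the question, inlined for the sketch
XML = """<workbook>
<style>
<style-rule element='worksheet'>
<format attr='font-family' value='Tahoma' />
<format attr='font-size' value='15' />
<format attr='font-weight' value='bold' />
<format attr='color' value='#ffbe7d' />
</style-rule>
</style>
</workbook>"""


class TagLogger(HTMLParser):
    """Hypothetical helper: records the start tags and text the parser reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.text_chunks = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_data(self, data):
        self.text_chunks.append(data)


logger = TagLogger()
logger.feed(XML)
logger.close()

# Only the tags outside and including <style> should be reported as tags
print(logger.start_tags)

# The markup inside <style> arrives as plain text instead
raw_text = ''.join(logger.text_chunks)
print('<style-rule' in raw_text)
```

If the guess is right, style-rule and format never show up in start_tags, and their markup appears inside raw_text instead, which would explain why find('style-rule') has nothing to match.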
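A workaround:
from bs4 import BeautifulSoup

# Read the XML from the question into a string
with open('test.xml') as xmlfile:
    text = xmlfile.read()

soup = BeautifulSoup(text, 'html.parser')
# The contents of <style> come back as one string, so parse that string again
inner = BeautifulSoup(soup.find('style').text, 'html.parser')
for fmt in inner.select('style-rule > format'):
    print(fmt)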
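Since the file is XML rather than HTML, another option is to skip the HTML parser altogether. The sketch below swaps BeautifulSoup for the standard library's xml.etree.ElementTree, so it is a different technique from the workaround above, not a variation of it. The XML is inlined here instead of being read from test.xml, and the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# Same document as test.xml in the question, inlined for the sketch
XML = """<workbook>
<style>
<style-rule element='worksheet'>
<format attr='font-family' value='Tahoma' />
<format attr='font-size' value='15' />
<format attr='font-weight' value='bold' />
<format attr='color' value='#ffbe7d' />
</style-rule>
</style>
</workbook>"""

root = ET.fromstring(XML)  # root is the <workbook> element

# An XML parser gives <style> no special meaning, so it has real children
style_rule = root.find('style/style-rule')
print(style_rule.get('element'))

# Collect each <format> element's attr/value pair
formats = {f.get('attr'): f.get('value') for f in style_rule.findall('format')}
for attr, value in formats.items():
    print(attr, value)
```

To read from disk instead, ET.parse('test.xml').getroot() should give the same root element. If you would rather stay with BeautifulSoup, passing 'xml' as the parser is the equivalent route, but that one needs lxml installed.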
Upvotes: 1