regex findall in beautifulsoup -python 3

Question

I need to get the name, value, and contextref of every ix:nonfraction tag. Each tag looks like this:

<ix:nonfraction name="...:TangibleFixedAssets" contextref="FY1.END" ...>238,011</ix:nonfraction>

The output I need is:

TangibleFixedAssets, FY1.end, 238,011

The string that the regex has to search contains many of these tags. Is there a way to keep all three values together, either concatenated or at the same index of the result list?

宏杰李 · Accepted Answer

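import bs4

# Sample markup reconstructed from the output below; the "ns" prefix is a placeholder.
html = '''<ix:nonfraction name="ns:TangibleFixedAssets" contextref="FY1.END">238,011</ix:nonfraction>'''

soup = bs4.BeautifulSoup(html, 'lxml')

# The lxml HTML parser lower-cases tag and attribute names.
ixs = soup.find_all('ix:nonfraction')
for ix in ixs:
    name = ix['name'].split(':')[-1]  # drop the namespace prefix
    contextref = ix['contextref']
    text = ix.text
    output = [name, contextref, text]  # all three values together in one list
    print(output)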

out:

['TangibleFixedAssets', 'FY1.END', '238,011']
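
Since the question asked about regex specifically: re.findall returns one tuple per match when the pattern has several capture groups, so the three values stay at the same index. The sketch below is only an illustration under assumptions: the "ns" prefix and the second tag are made up, the name attribute is assumed to come before contextref, and attribute values are assumed to be double-quoted. For real documents a parser such as BeautifulSoup is usually the safer choice.

```python
import re

# Hypothetical sample: the "ns" prefix and the second tag are invented for illustration.
html = '''
<ix:nonfraction name="ns:TangibleFixedAssets" contextref="FY1.END">238,011</ix:nonfraction>
<ix:nonfraction name="ns:Debtors" contextref="FY1.END">1,250</ix:nonfraction>
'''

# Three capture groups: local part of name, contextref, and the tag text.
# Assumes name precedes contextref and both values are double-quoted.
pattern = re.compile(
    r'<ix:nonfraction\b[^>]*?\bname="(?:[^"]*:)?([^":]+)"'
    r'[^>]*?\bcontextref="([^"]+)"[^>]*>'
    r'([^<]*)</ix:nonfraction>',
    re.IGNORECASE,
)

# findall yields a list of (name, contextref, text) tuples, one per tag.
rows = pattern.findall(html)
for name, contextref, text in rows:
    print(', '.join((name, contextref, text)))
```

Each element of rows is a 3-tuple, so rows[0] holds everything for the first tag and ', '.join(rows[0]) gives the concatenated form.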
