Fetching nth child using BeautifulSoup Python3

Question

I am using Beautiful Soup with Python 3 to scrape a website. This is the XML data I am getting:


        MATERIALSET('R100100100')
        2018-05-11T04:28:47Z
        
        
        
            
                R100100100
                Z100
                       1.000
                29.06.2018
                5000000041

I just want to extract the data in d:BANFN. If I write soup.select('d:BANFN') directly, it raises an error mentioning 'nth_child_of_type'. I went through some related questions on Stack Overflow, "Getting the nth element using BeautifulSoup" and "selecting second child in beautiful soup with soup.select?", but nothing helped. Please help.

Rachit kapadia · Accepted Answer

The XML file must include the opening tag for the entry element; only then will you be able to parse it:



    
        MATERIALSET('R100100100')
        2018-05-11T04:28:47Z
        
        
        
            
                R100100100
                Z100
                       1.000
                29.06.2018
                5000000041

from bs4 import BeautifulSoup

# Read the XML file as text
with open("sample.xml", "r") as f:
    content = f.read()

# The lxml HTML parser lowercases tag names, so d:BANFN is found as "d:banfn"
soup = BeautifulSoup(content, "lxml")

# Print the text of the first d:BANFN element
print("BANFN value : {}".format(soup.find_all("d:banfn")[0].text))

# Output:
# BANFN value : 5000000041
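
As a self-contained variant of the same lookup, the sketch below parses an inline string instead of sample.xml. A plain soup.select('d:BANFN') most likely fails because a colon in a CSS selector introduces a pseudo-class, so looking the tag up by name with find() sidesteps the selector syntax altogether. The sample markup, the d:OTHER element, and the extract_banfn helper are hypothetical; only d:BANFN and its value come from the question. html.parser is used here so that nothing beyond bs4 needs to be installed, and it lowercases tag names in the same way as the lxml parser used above.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down stand-in for the feed in the question.
# Only d:BANFN and its value are taken from the original post.
SAMPLE_XML = """
<entry>
    <d:OTHER>Z100</d:OTHER>
    <d:BANFN>5000000041</d:BANFN>
</entry>
"""


def extract_banfn(xml_text):
    """Return the text of the first d:BANFN element, or None if it is absent."""
    # HTML parsers lowercase tag names, so the lookup uses "d:banfn".
    soup = BeautifulSoup(xml_text, "html.parser")
    tag = soup.find("d:banfn")
    if tag is None:
        return None
    return tag.get_text(strip=True)


if __name__ == "__main__":
    print("BANFN value : {}".format(extract_banfn(SAMPLE_XML)))
```

Returning None for a missing element avoids the IndexError that indexing an empty find_all() result with [0] would raise.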
