Road Name

Reputation: 129

Get the content of div within div using BeautifulSoup?

I want to get the content of a div with the class "gt-read", leaving out the comments and the nested div (which has a different class) inside it. Here is my script so far:

from bs4 import BeautifulSoup

data = """
    <div class='gt-read'>
        <!-- no need -->
        <!-- some no need -->

        <b>Bold text</b> - some text here <br/>
        lorem ipsum here <br/>
        <strong> Author Name</strong>

        <div class='some-class'>
            <script>
                #...
                Js script here
                #...
            </script>
        </div>
    </div>
    """
soup = BeautifulSoup(data, 'lxml')
# Look up the outer div by its actual class name, "gt-read"
get_class = soup.find("div", {"class": "gt-read"})
print('notices', get_class.get_text())
print('notices', get_class)

I want the result to look like this:

<b>Bold text</b> - some text here <br/>
lorem ipsum here <br/>
<strong> Author Name</strong>

Kindly help.

Upvotes: 0

Views: 1377

Answers (1)

Martin Evans

Reputation: 46779

The following should display what you need:

from bs4 import BeautifulSoup, Comment  

data = """
    <div class='gt-read'>
        <!-- no need -->
        <!-- some no need -->

        <b>Bold text</b> - some text here <br/>
        lorem ipsum here <br/>
        <strong> Author Name</strong>

        <div class='some-class'>
            <script>
                #...
                Js script here
                #...
            </script>
        </div>
    </div>
    """
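soup = BeautifulSoup(data, 'lxml')
get_class = soup.find("div", {"class": "gt-read"})

# Remove every HTML comment inside the outer div
comments = get_class.find_all(string=lambda text: isinstance(text, Comment))
for comment in comments:
    comment.extract()

# Remove the nested div (and the script it contains)
get_class.find("div").extract()

# decode_contents() returns the inner markup as a string
text = get_class.decode_contents().strip()

print(text)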

Giving you the following output:

<b>Bold text</b> - some text here <br/>
        lorem ipsum here <br/>
        <strong> Author Name</strong>

This finds the div with the class gt-read, removes all comments and the nested div from it, and returns the remaining inner markup with the surrounding whitespace stripped.
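If you need this in more than one place, the same steps could be wrapped in a small helper. The following is only a sketch: the function name inner_markup and its parameters are made up for illustration, it uses Python's built-in "html.parser" rather than lxml so it runs without extra dependencies (pass parser="lxml" to match the code above), and it assumes you only want to drop divs that are direct children of the outer div.

```python
from bs4 import BeautifulSoup, Comment


def inner_markup(html, outer_class, parser="html.parser"):
    """Hypothetical helper: inner markup of the first div with class
    `outer_class`, without comments and without direct child divs."""
    soup = BeautifulSoup(html, parser)
    outer = soup.find("div", class_=outer_class)
    if outer is None:
        # No matching div in the document
        return None

    # Drop every HTML comment inside the outer div
    for comment in outer.find_all(string=lambda s: isinstance(s, Comment)):
        comment.extract()

    # Drop divs that are direct children of the outer div
    for nested in outer.find_all("div", recursive=False):
        nested.extract()

    # Serialize what is left and trim the surrounding whitespace
    return outer.decode_contents().strip()


sample = """
<div class='gt-read'>
    <!-- no need -->
    <b>Bold text</b> - some text here <br/>
    <strong> Author Name</strong>
    <div class='some-class'><script>var x = 1;</script></div>
</div>
"""
result = inner_markup(sample, "gt-read")
print(result)
```

Returning None when the div is missing avoids the AttributeError you would otherwise get from calling methods on the result of a failed find().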

Upvotes: 2
