user3969578

Finding first tag in HTML file with BeautifulSoup

I have a set of HTML files, and I want to pull the first tag from each one. Since no specific tag is guaranteed to come first in every file, I’m not sure how to do this.

As an example, for the following snippet, the first tag would be <html>.

<html>
 <head>
    <title>
     insert title here
    </title>
 </head>
</html>

Is there a way to accomplish this with BeautifulSoup (or possibly another tool)? Thanks in advance :)

Upvotes: 7

Views: 13534

Answers (1)

alecxe

Reputation: 473763

You can use BeautifulSoup here. Calling find() with no arguments on a BeautifulSoup object returns the first element in the tree, and its .name attribute gives you the tag name:

from bs4 import BeautifulSoup

data = """
<html>
 <head>
    <title>
     insert title here
    </title>
 </head>
</html>
"""

soup = BeautifulSoup(data, "html.parser")
# find() with no arguments returns the first Tag in document order
print(soup.find().name)  # html
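
Since the question is about a whole set of files, here is a minimal sketch of how the same find() call might be applied across a directory. The helper names first_tag_name and first_tags_in_dir, the "*.html" glob pattern, and the UTF-8 encoding are assumptions for illustration and are not part of the original answer. It also guards against input that contains no tags at all, in which case find() returns None:

```python
from pathlib import Path
from typing import Dict, Optional

from bs4 import BeautifulSoup


def first_tag_name(html: str) -> Optional[str]:
    """Return the name of the first tag in the markup, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    # find() with no arguments returns the first Tag in document order.
    # Doctype declarations and comments are not Tag objects, so they are skipped.
    first = soup.find()
    return first.name if first is not None else None


def first_tags_in_dir(directory: str) -> Dict[str, Optional[str]]:
    """Map each *.html file name in `directory` (hypothetical layout) to its first tag name."""
    return {
        # Assumes the files are UTF-8 encoded; adjust if yours are not.
        path.name: first_tag_name(path.read_text(encoding="utf-8"))
        for path in sorted(Path(directory).glob("*.html"))
    }
```

For example, first_tag_name("<!DOCTYPE html><html></html>") gives "html", because the doctype is not treated as a tag.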

Upvotes: 7
