js352

Reputation: 374

Getting meta property with beautifulsoup

I am trying to extract the Open Graph ("og") properties from a website. What I want is a list of all the meta tags in the document whose property starts with "og".

What I've tried is:

soup.find_all("meta", property="og:")

and

soup.find_all("meta", property="og")

But neither call finds anything unless I specify the complete property value.

A few examples are:

 <meta content="https://www.youtube.com/embed/Rv9hn4IGofM" property="og:video:url"/>,
 <meta content="https://www.youtube.com/embed/Rv9hn4IGofM" property="og:video:secure_url"/>,
 <meta content="text/html" property="og:video:type"/>,
 <meta content="1280" property="og:video:width"/>,
 <meta content="720" property="og:video:height"/>

Expected output would be:

l = ["og:video:url", "og:video:secure_url", "og:video:type", "og:video:width", "og:video:height"]

How can I do this?

Thank you

Upvotes: 1

Views: 475

Answers (3)

uingtea

Reputation: 6554

Use the CSS selector meta[property], which matches every meta tag that has a property attribute:

metas = soup.select('meta[property]')
propValue = [v['property'] for v in metas]
print(propValue)
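
Note that meta[property] also matches non-Open Graph tags that carry a property attribute. As a minimal sketch of narrowing it down, the CSS prefix operator ^= can restrict the match to values that start with "og:". The sample HTML below is a hypothetical stand-in, not the asker's page:

```python
from bs4 import BeautifulSoup

# Hypothetical sample: three Open Graph tags plus two unrelated <meta> tags.
html = """
<meta content="text/html" property="og:video:type"/>
<meta content="1280" property="og:video:width"/>
<meta content="720" property="og:video:height"/>
<meta content="123" property="fb:app_id"/>
<meta content="A sample page" name="description"/>
"""

soup = BeautifulSoup(html, "html.parser")

# [property^="og:"] matches only attribute values that start with "og:".
og_props = [tag["property"] for tag in soup.select('meta[property^="og:"]')]
print(og_props)
```

With this selector the fb:app_id and description tags should be skipped, since neither has a property value beginning with "og:".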

Upvotes: 2

MendelG

Reputation: 20098

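You can pass a function to find_all() that checks whether the property starts with "og:":

...
soup = BeautifulSoup(html, "html.parser")

og_elements = [
    tag["property"]
    # The filter receives None for meta tags without a property attribute, so guard against it.
    for tag in soup.find_all("meta", property=lambda t: t is not None and t.startswith("og:"))
]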

print(og_elements)
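
An alternative sketch, assuming the same goal: find_all() also accepts a compiled regular expression as an attribute filter, which avoids writing a function. The sample HTML and variable names here are illustrative only:

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample: two Open Graph tags, one non-og property, one tag with no property.
html = """
<meta content="https://www.youtube.com/embed/Rv9hn4IGofM" property="og:video:url"/>
<meta content="text/html" property="og:video:type"/>
<meta content="123" property="fb:app_id"/>
<meta charset="utf-8"/>
"""

soup = BeautifulSoup(html, "html.parser")

# The pattern is applied with search(), so anchor it with ^ to require the "og:" prefix.
og_tags = soup.find_all("meta", property=re.compile(r"^og:"))
og_names = [tag["property"] for tag in og_tags]
print(og_names)
```

Tags without a property attribute simply fail to match, so no explicit None check is needed in this variant.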

Upvotes: 1

baduker

Reputation: 20050

Is this what you want?

from bs4 import BeautifulSoup

sample = """
<html>
<body>
<meta content="https://www.youtube.com/embed/Rv9hn4IGofM" property="og:video:url"/>,
<meta content="https://www.youtube.com/embed/Rv9hn4IGofM" property="og:video:secure_url"/>,
<meta content="text/html" property="og:video:type"/>,
<meta content="1280" property="og:video:width"/>,
<meta content="720" property="og:video:height"/>
</body>
</html>
"""

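# property=True skips meta tags that have no property attribute, avoiding a KeyError.
print([m["property"] for m in BeautifulSoup(sample, "html.parser").find_all("meta", property=True)])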

Output:

['og:video:url', 'og:video:secure_url', 'og:video:type', 'og:video:width', 'og:video:height']

Upvotes: 1
