taga

Reputation: 3895

How to get meta tag value with BeautifulSoup soup.select

I want to extract the date from an HTML meta tag. I'm using Python and Beautiful Soup.

<meta name="Email" content="[email protected]">
<meta name="Date" content="2021-04-28T20:35:00+02:00">
<meta name="title" content="Tris is tite">

I want to extract only the date, so the result should be: 2021-04-28T20:35:00+02:00

I know that I can do it like this:

tag = "meta['Date']"
date = soup.select(tag)
date = date['content']

But is it possible to do that with a single CSS selector that returns the attribute value directly? For example, something like this?

tag = "meta['Date']['content']" # or something like this?
date = soup.select(tag)
print(date)
2021-04-28T20:35:00+02:00

PS: I have to use soup.select. soup.find(...) and soup.select_one do not work for me, so only soup.select is an option.

Upvotes: 1

Views: 239

Answers (2)

Mahrez BenHamad

Reputation: 2086

You can simply use:

>>> from bs4 import BeautifulSoup
>>> content = """<meta name="Date" content="2021-04-28T20:35:00+02:00"> """
>>> soup = BeautifulSoup(content, 'html.parser')
>>> soup.find("meta", {"name": "Date"}).attrs['content']
'2021-04-28T20:35:00+02:00'

If you want to extract all 'meta' tags and print the content attribute of each one using 'select':

>>> for item in soup.select("meta"):
...     print(item.attrs.get('content'))
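Since the loop above prints the content of every meta tag, one way to pick out a single value afterwards is to collect the results into a dictionary keyed by the name attribute. This is only a sketch: the variable name meta_by_name is made up for illustration, and it assumes every tag of interest has both a name and a content attribute.

```python
from bs4 import BeautifulSoup

content = """<meta name="Date" content="2021-04-28T20:35:00+02:00">
<meta name="title" content="Tris is tite">"""
soup = BeautifulSoup(content, "html.parser")

# Keep only meta tags that carry both attributes, then map name -> content.
# meta_by_name is a hypothetical helper name, not part of Beautiful Soup.
meta_by_name = {
    tag["name"]: tag["content"]
    for tag in soup.select("meta[name][content]")
}

print(meta_by_name["Date"])
```

The lookup is case-sensitive, so "Date" and "date" would be different keys.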

Upvotes: 0

Jack Fleeting

Reputation: 24940

You are almost there. It seems you are confusing the attribute names. Try:

tag = "meta[content]" 
date = soup.select_one(tag)
print(date.get('content'))

Output:

2021-04-28T20:35:00+02:00

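Edit: "meta[content]" matches the first meta tag that has a content attribute, which is not necessarily the Date one. To target the Date tag specifically, change the tag line to: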

tag = "meta[content][name='Date']" 
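For completeness, here is a sketch of the same selector used with soup.select only, since that is the constraint in the question. As far as I know, a CSS selector can only match elements, not attribute values, so select returns a list of tags and the content attribute still has to be read in a second step. The email value below is a placeholder, because the original one is obscured in the question.

```python
from bs4 import BeautifulSoup

# Sample markup from the question; the email address is a placeholder.
html = """
<meta name="Email" content="user@example.com">
<meta name="Date" content="2021-04-28T20:35:00+02:00">
<meta name="title" content="Tris is tite">
"""

soup = BeautifulSoup(html, "html.parser")

# select() returns a list of matching Tag objects, even for a single match.
matches = soup.select("meta[content][name='Date']")

# Read the attribute from each matched tag.
dates = [tag["content"] for tag in matches]
print(dates)  # ['2021-04-28T20:35:00+02:00']
```

If exactly one Date tag is expected, dates[0] gives the string itself. Checking that the list is non-empty first avoids an IndexError on pages that lack the tag.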

Upvotes: 1
