Reputation: 3895
I want to extract the date from a single HTML tag. I'm using Python and Beautiful Soup.
<meta name="Email" content="[email protected]">
<meta name="Date" content="2021-04-28T20:35:00+02:00">
<meta name="title" content="Tris is tite">
I want to extract only the date, so the result should be: 2021-04-28T20:35:00+02:00
I know that I can do it like this:
tag = "meta['Date']"
date = soup.select(tag)
date = date['content']
But is it possible to do that with a single CSS selector, using only the tag value? For example, something like this?
tag = "meta['Date']['content']" # or something like this?
date = soup.select(tag)
print(date)
2021-04-28T20:35:00+02:00
PS: I have to use soup.select; soup.find(...) and soup.select_one do not work for me. So only soup.select works!
Upvotes: 1
Views: 239
Reputation: 2086
You can simply use:
>>> from bs4 import BeautifulSoup
>>> content = """<meta name="Date" content="2021-04-28T20:35:00+02:00"> """
>>> soup = BeautifulSoup(content, 'html.parser')
>>> soup.find("meta", {"name": "Date"}).attrs['content']
'2021-04-28T20:35:00+02:00'
If you want to go through all 'meta' tags using 'select' and print each one's content attribute (the Date value among them):
>>> for item in soup.select("meta"):
...     print(item.attrs.get('content'))
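If only soup.select is an option, a minimal sketch along the same lines narrows the selector to the Date tag and reads the attribute from each match. The names html and dates are illustrative and not from the original post; the markup is copied from the question.

```python
from bs4 import BeautifulSoup

# Markup copied from the question
html = """
<meta name="Email" content="[email protected]">
<meta name="Date" content="2021-04-28T20:35:00+02:00">
<meta name="title" content="Tris is tite">
"""
soup = BeautifulSoup(html, "html.parser")

# select() always returns a list of Tag objects, so iterate (or index) it
# and read the attribute from each matching tag.
dates = [tag["content"] for tag in soup.select('meta[name="Date"]')]
print(dates[0])
```

Because select returns a list even for a single match, the attribute has to be read from an element of that list rather than from the list itself.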
Upvotes: 0
Reputation: 24940
You are almost there. It seems you are confusing the attribute names. Try:
tag = "meta[content]"
date = soup.select_one(tag)
print(date.get('content'))
Output:
2021-04-28T20:35:00+02:00
Edit: To match only the Date tag, change the tag line to:
tag = "meta[content][name='Date']"
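A CSS selector picks elements, not attribute values, so the value still has to be read from the matched tag afterwards. Since the question requires soup.select, here is a rough sketch of both selectors run against the three tags from the question; all_meta, date_tags and date are names chosen for illustration.

```python
from bs4 import BeautifulSoup

# Markup copied from the question
html = """
<meta name="Email" content="[email protected]">
<meta name="Date" content="2021-04-28T20:35:00+02:00">
<meta name="title" content="Tris is tite">
"""
soup = BeautifulSoup(html, "html.parser")

# meta[content] matches every <meta> tag that has a content attribute,
# so on this markup it is not specific to the Date tag.
all_meta = soup.select("meta[content]")
print(len(all_meta))

# Adding [name='Date'] narrows the match to the Date tag only.
date_tags = soup.select("meta[content][name='Date']")

# select() returns a list; guard against an empty result before indexing.
date = date_tags[0]["content"] if date_tags else None
print(date)
```

The guard on an empty list is a design choice for pages where the Date tag may be missing; drop it if the tag is always present.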
Upvotes: 1