Reputation: 2735
I need to extract the Device Time (i.e. 2012-01-17 13:12:09) from the text below using Python. Could you please tell me how to do this with Python's standard regular expression library? Thanks.
<html><head><style type="text/css">h1 {color:blue;}h2 {color:red;}</style>
<h1>Device #1 Root Content</h1><h2>Device Addr: 127.0.0.1:8080</h1>
<h2>Device Time: 2012-01-17 13:12:09</h2></body></html>
Upvotes: 0
Views: 1004
Reputation: 57650
You need this regex.
/Device Time: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})/
or this,
/Device Time: (\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d)/
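Use this regular expression with the global switch on. In Python the surrounding slashes are not part of the pattern and there is no global switch as such; re.findall or re.finditer returns every match instead.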
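To show how the first pattern would be used from Python, here is a minimal sketch. It assumes the page source is already held in a string; the names html, DEVICE_TIME, first_time and all_times are illustrative and not from the answer.

```python
import re

# Sample taken from the question; in practice this would be the fetched page.
html = (
    '<html><head><style type="text/css">h1 {color:blue;}h2 {color:red;}</style>\n'
    '<h1>Device #1 Root Content</h1><h2>Device Addr: 127.0.0.1:8080</h1>\n'
    '<h2>Device Time: 2012-01-17 13:12:09</h2></body></html>'
)

# No surrounding slashes in Python; a raw string keeps the backslashes intact.
DEVICE_TIME = re.compile(r"Device Time: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")

# search() returns the first match, or None if nothing matched.
match = DEVICE_TIME.search(html)
first_time = match.group(1) if match else None

# findall() plays the role of the "global" switch: every captured group, in order.
all_times = DEVICE_TIME.findall(html)

print(first_time)
```

Checking for None before calling group() avoids an AttributeError when the page does not contain a Device Time line.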
Upvotes: 1
Reputation: 6277
Just to add
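import re
# html is assumed to hold the page source as a string
pattern = re.compile(r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})')
first_match = pattern.search(html)
# search() returns None when nothing matches, so guard before calling group()
device_time = first_match.group(1) if first_match else None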
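If the goal is a real date-time value rather than a string, the matched text can probably be handed to datetime.strptime from the standard library. This is a sketch under the assumption that the device always reports the time as YYYY-MM-DD HH:MM:SS; the shortened html string and the device_time name are made up for illustration.

```python
import re
from datetime import datetime

# Shortened sample; only the line of interest from the question is kept.
html = "<h2>Device Time: 2012-01-17 13:12:09</h2>"

pattern = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
first_match = pattern.search(html)

device_time = None
if first_match:
    # strptime raises ValueError if the digits do not form a valid date,
    # e.g. month 13, which the regex alone would happily accept.
    device_time = datetime.strptime(first_match.group(1), "%Y-%m-%d %H:%M:%S")

print(device_time)
```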
Upvotes: 2
Reputation: 4564
Try this regex
Device Time: ([^<]+)
This returns everything after the words "Device Time: " up to the start of the next HTML tag. As shown in another answer, you could also search for a more specific date-time format.
In general it's considered bad practice to parse HTML files with regex. However, your example is more like parsing ordinary text that happens to be part of an HTML file, so in this case it's fine ;-)
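For comparison, here is a rough sketch of the non-regex route using html.parser from the standard library. The DeviceTimeFinder class and its PREFIX attribute are hypothetical names made up for this example, and it assumes the value always appears as a text node starting with "Device Time:".

```python
from html.parser import HTMLParser


class DeviceTimeFinder(HTMLParser):
    """Remembers the first text node that starts with 'Device Time:'."""

    PREFIX = "Device Time:"

    def __init__(self):
        super().__init__()
        self.device_time = None

    def handle_data(self, data):
        # Called for each run of text between tags (including <style> content).
        text = data.strip()
        if self.device_time is None and text.startswith(self.PREFIX):
            self.device_time = text[len(self.PREFIX):].strip()


# Sample taken from the question, mismatched tags included.
html = (
    '<html><head><style type="text/css">h1 {color:blue;}h2 {color:red;}</style>\n'
    '<h1>Device #1 Root Content</h1><h2>Device Addr: 127.0.0.1:8080</h1>\n'
    '<h2>Device Time: 2012-01-17 13:12:09</h2></body></html>'
)

finder = DeviceTimeFinder()
finder.feed(html)
finder.close()
device_time = finder.device_time

print(device_time)
```

html.parser is lenient about the mismatched tags in the sample, so this should keep working if the markup around the value changes, which is the usual argument against the regex approach.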
Upvotes: 1
Reputation: 229
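Maybe like this:
import re
html = """ Your HTML String here"""  # named html rather than str, which would shadow the built-in
pattern = re.compile(r"""Device Time:([ \d\-:]*)""")
match = pattern.search(html)
# search() returns None if there is no match; strip() drops the captured leading space
device_time = match.group(1).strip() if match else None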
Upvotes: 1