F. Aydemir

Reputation: 2735

Python - regex matching in HTML Body

I need to parse the Device Time (e.g. 2012-01-17 13:12:09) out of the text below using Python. Could you please tell me how to do this with Python's standard regular expression library? Thanks.

  <html><head><style type="text/css">h1 {color:blue;}h2 {color:red;}</style>
  <h1>Device #1   Root Content</h1><h2>Device Addr: 127.0.0.1:8080</h1>
  <h2>Device Time: 2012-01-17 13:12:09</h2></body></html>

Upvotes: 0

Views: 1004

Answers (4)

Shiplu Mokaddim

Reputation: 57650

You need this regex.

/Device Time: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})/

or this,

/Device Time: (\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d)/

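Use this regular expression with the global switch on. Python's re module has no such switch: drop the surrounding slashes, pass the pattern as a raw string, and call re.findall (or re.finditer) to get every match.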
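
A minimal sketch of the first pattern using Python's re module, assuming the page source is already held in a string (the name `html` and its contents are illustrative, copied from the question's sample):

```python
import re

# Illustrative input: the last line of the sample from the question.
html = '<h2>Device Time: 2012-01-17 13:12:09</h2></body></html>'

# A raw string keeps the backslashes in \d intact.
pattern = re.compile(r'Device Time: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})')

# With one capture group, findall returns the captured text of each match.
device_times = pattern.findall(html)
print(device_times)
```

If only one timestamp is expected, `pattern.search(html)` followed by `.group(1)` would do the same job for the first match.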

Upvotes: 1

Shadow

Reputation: 6277

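Just to add, in Python:

import re
# html is assumed to hold the page source as a string
pattern = re.compile(r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})')
first_match = pattern.search(html)  # a match object, or None if nothing is found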

Upvotes: 2

bw_üezi

Reputation: 4564

Try this regex

Device Time: ([^<]+)

This returns everything after the words "Device Time: " up to the start of the next HTML tag. As shown in another answer, you could also match the specific date-time format.
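
A short sketch of how that pattern might be used, assuming the page source is in a string (the name `html` and the sample text are illustrative, taken from the question):

```python
import re

# Illustrative input: the last line of the sample from the question.
html = '<h2>Device Time: 2012-01-17 13:12:09</h2></body></html>'

# [^<]+ consumes characters until the next '<', i.e. the start of the closing tag.
match = re.search(r'Device Time: ([^<]+)', html)

# search returns None when the text is absent, so guard before calling group.
device_time = match.group(1) if match else None
print(device_time)
```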

In general it's considered bad practice to parse HTML files with regex. However, your example is more like parsing ordinary text that happens to be part of an HTML file... In this case that's kind of fine... ;-)

Upvotes: 1

xueyumusic

Reputation: 229

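Maybe like this:

import re

# Named html rather than str, so the built-in str type is not shadowed
html = """ Your HTML String here"""

pattern = re.compile(r"""Device Time:([ \d\-:]*)""")
s = pattern.search(html)

# s is None when nothing matches; the group keeps its leading space, so strip it
time = s.group(1).strip() if s else None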
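
If a real datetime value is wanted rather than a string, the captured text could be handed to datetime.strptime. This is only a sketch, assuming the timestamp always has the YYYY-MM-DD HH:MM:SS form shown in the question; the sample string is illustrative:

```python
import re
from datetime import datetime

# Illustrative input: the relevant line of the sample from the question.
html = '<h2>Device Time: 2012-01-17 13:12:09</h2>'

found = re.search(r'Device Time:([ \d\-:]*)', html)

device_time = None
if found:
    # The group includes the space after the colon, so strip before parsing.
    device_time = datetime.strptime(found.group(1).strip(), '%Y-%m-%d %H:%M:%S')
print(device_time)
```

strptime raises ValueError if the captured text does not fit the format, which may be useful as a sanity check on the match.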

Upvotes: 1
