user3611091

Reputation: 47

Regex - Match any character across multiple lines

I have an HTML string that looks like:

<img src="blah blah blah"><p> blah blah
blah blah blah blah blah blah
blah blah blah</p>

How can I read the blah blah... text using regex? I tried (.+?) but it's not working, and I searched Google but didn't find a solution for Python.

Thanks!

Upvotes: 1

Views: 59

Answers (3)

Avinash Raj

Reputation: 174726

You could also try the code below, which uses the (?s) DOTALL modifier so that . matches newlines as well:

>>> s = """<img src="blah blah blah"><p> blah blah
... blah blah blah blah blah blah
... blah blah blah</p>"""
>>> import re
>>> m = re.search(r'(?s)(?<=<p>).*?(?=<\/p>)', s).group(0)
>>> print(m)
 blah blah
blah blah blah blah blah blah
blah blah blah
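
The inline (?s) flag has the same effect as passing re.DOTALL in the flags argument. A minimal sketch, reusing the string from the question (the names html, without_dotall, and with_dotall are only illustrative), showing why (.+?) on its own fails here:

```python
import re

html = '''<img src="blah blah blah"><p> blah blah
blah blah blah blah blah blah
blah blah blah</p>'''

# Without DOTALL, "." does not match "\n", so the match cannot span lines.
without_dotall = re.search(r'<p>(.+?)</p>', html)

# With re.DOTALL (same effect as the inline (?s) flag), "." also matches "\n".
with_dotall = re.search(r'<p>(.+?)</p>', html, re.DOTALL)

print(without_dotall)        # None
print(with_dotall.group(1))  # the three lines of paragraph text
```

Using a capture group instead of the lookbehind/lookahead pair is just a matter of taste; both return the same text.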

Upvotes: 0

zx81

Reputation: 41838

With the usual disclaimers about using regex to parse html, this will work:

import re

# subject is assumed to hold the HTML string from the question
match = re.search(r"<img[^>]*><p>([^<]*)</p>", subject)
if match:
    blahblah = match.group(1)
    print(blahblah)

Explanation

  • <img matches literal chars
  • [^>]* matches any chars that are not >
  • ><p> matches literal chars
  • ([^<]*) captures any chars that are not < to Group 1 (this is what we want)
  • </p> matches literal chars
  • match.group(1) contains our string
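
As a quick check of the steps above, here is a sketch that defines subject as the string from the question (an assumption, since the answer leaves it undefined). Because [^<] also matches newlines, no DOTALL flag is needed. The trade-off is that the match fails as soon as the paragraph contains a nested tag; the nested string below is a made-up input to show that.

```python
import re

subject = '''<img src="blah blah blah"><p> blah blah
blah blah blah blah blah blah
blah blah blah</p>'''

pattern = r"<img[^>]*><p>([^<]*)</p>"

match = re.search(pattern, subject)
text = match.group(1) if match else None
print(text)

# Hypothetical input with a nested tag: [^<]* stops at "<b>", so nothing matches.
nested = '<img src="x"><p>blah <b>blah</b></p>'
nested_match = re.search(pattern, nested)
print(nested_match)  # None
```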

Upvotes: 2

jawee

Reputation: 271

Here is an example in Java:

// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void testRegExp() {
    try {
        String input = "<img src=\"blah blah blah\"><p> blah blah"
                + "\n blah blah blah blah blah blah"
                + "\nblah blah blah</p>";
        // One or more occurrences of "blah" followed by whitespace (newlines included)
        Pattern pMod = Pattern.compile("(blah\\s+)+");
        Matcher mMod = pMod.matcher(input);
        while (mMod.find()) {
            System.out.println("--------------");
            System.out.println(mMod.group(0));
        }
    } catch (Exception ex) {
        ex.printStackTrace();
    }
}

The output is:

blah blah

blah blah blah blah blah blah blah blah blah blah

For Python, I guess the regular expression is similar. Good luck, and give it a try.
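
A rough Python counterpart of the Java snippet, using re.finditer in place of the Matcher.find() loop. This is only a sketch with illustrative variable names. Note that this pattern also matches inside the src attribute, and it leaves out any "blah" that is not followed by whitespace (such as the one right before </p>).

```python
import re

text = ('<img src="blah blah blah"><p> blah blah'
        '\n blah blah blah blah blah blah'
        '\nblah blah blah</p>')

# Same idea as the Java pattern: one or more "blah" tokens, each followed by
# whitespace. \s matches newlines, so no DOTALL flag is involved here.
pattern = re.compile(r'(?:blah\s+)+')

matches = [m.group(0) for m in pattern.finditer(text)]
for found in matches:
    print('--------------')
    print(found)
```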

Upvotes: 0
