Scottmeup

Reputation: 162

Regex match between two regex expressions

This has been driving me crazy; I can't find a solution that works! I'm trying to match the text between a couple of tags with a regex. Bad idea, I've heard, but necessary this time :P At the start I have a <body class="foo">, where foo can vary between files. Searching for <body.*?> works fine to locate the only copy in each file.

At the end I have a <div id="bar">, bar doesn't change between files.

E.g.:

<body class="foo">
sometext
some more text
<maybe even some tags>
<div id="bar">

What I need to do is select everything between the two tags, but not the tags themselves: everything between the closing > of body and the opening < of div, i.e. from sometext through <maybe even some tags>.

I've tried a bunch of things, mostly variations on (?<=<body.*>)(.*?)(?=<div id="bar">), but in Notepad++ and on http://regexpal.com/ I get an invalid-expression error at worst and no matches at best.

Any help appreciated!

Upvotes: 0

Views: 80

Answers (2)

hwnd

Reputation: 70732

You are attempting to use a variable-length lookbehind, which Notepad++ and most regular expression flavors do not support. I assume you are using Notepad++, so you can use the \K escape sequence instead.

<body[^>]*>\K.*?(?=<div id="bar">)

The \K escape sequence resets the starting point of the reported match, so any characters consumed before it are no longer included in the result. Make sure the ". matches newline" checkbox is checked as well, since the text you want spans several lines.
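The lookbehind limitation is not specific to Notepad++. As a rough illustration, here is a minimal Python sketch showing that the standard re module also rejects the pattern from the question. The helper name compiles is made up for this example, and Python's built-in re does not support \K, so this only demonstrates the failure, not the \K fix.

```python
import re

# The question's attempt: the lookbehind contains .*, so its width is variable.
attempt = r'(?<=<body.*>)(.*?)(?=<div id="bar">)'

def compiles(pattern: str) -> bool:
    """Hypothetical helper: True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern, re.DOTALL)
        return True
    except re.error:
        return False

# Rejected: Python's re requires a fixed-width lookbehind.
lookbehind_ok = compiles(attempt)

# A fixed-width lookbehind compiles, but it cannot cover a class attribute
# whose value changes between files.
fixed_ok = compiles(r'(?<=<body>)(.*?)(?=<div id="bar">)')
```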

Alternatively, you can use a capturing group and avoid using lookaround assertions.

<body[^>]*>(.*?)<div id="bar">

Note: Using a capturing group, you can refer to group index "1" to get your match result.
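If you ever need the same thing outside Notepad++, the capturing-group version carries over to most engines. A minimal sketch in Python, assuming the sample input from the question (re.DOTALL plays the role of the ". matches newline" checkbox; the variable names are just for illustration):

```python
import re

# Sample input modelled on the question; the class value is arbitrary.
html = '''<body class="foo">
sometext
some more text
<maybe even some tags>
<div id="bar">'''

# re.DOTALL lets . match newlines, so the lazy group can span several lines.
pattern = re.compile(r'<body[^>]*>(.*?)<div id="bar">', re.DOTALL)

match = pattern.search(html)
# Group 1 holds only the text between the two tags, not the tags themselves.
inner = match.group(1) if match else None
```

Note that group 1 keeps the newline right after the body tag and the one right before the div; strip it if you don't want that whitespace.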

Upvotes: 2

hjpotter92

Reputation: 80639

Use the following pattern (the surrounding slashes are regex delimiters, not part of the pattern itself):

/<body[^>]*>(.*?)<div id="bar">/

Upvotes: 1
