Hossmeister

Reputation: 166

Capture text in quotes immediately before keyword

I have an input stream that looks like this:

"ignore this" blah "ignore this" blah "capture this" keyword "ignore this" blah

I want to capture "capture this", i.e. the text in quotes immediately before keyword.

I tried the regex (?:\"(.*)\" )(?=keyword), but this captures everything from the first quotation mark up to the one before keyword. How would I capture only the text in quotes directly before keyword?

Upvotes: 1

Views: 55

Answers (3)

The fourth bird

Reputation: 163632

The pattern (?:\"(.*)\" )(?=keyword) matches the first " and then, because .* is greedy and the dot also matches a double quote, extends to the last place where a double quote followed by a space is followed by keyword.

Note that in the pattern the non capturing group (?: can be omitted and the " does not have to be escaped.
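
To see this behaviour, here is a minimal sketch in JavaScript (the variable names are only for illustration). It also tries a lazy .*? variant, which is not part of the question, to show that laziness alone does not help, because the match still starts at the first ".

```javascript
const str = `"ignore this" blah "ignore this" blah "capture this" keyword "ignore this" blah`;

// Original pattern: the greedy .* runs past the inner quotes
const greedy = /(?:\"(.*)\" )(?=keyword)/.exec(str);
console.log(greedy[1]);

// Lazy variant: still anchored at the first ", so it has to expand
// until it reaches the only `" ` that is followed by keyword
const lazy = /"(.*?)" (?=keyword)/.exec(str);
console.log(lazy[1]);

// Both log: ignore this" blah "ignore this" blah "capture this
```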

You could use a negated character class instead to match any character except a ". The value is then in the first capturing group:

"([^"]+)"(?= keyword)

Explanation

  • " Match literally
  • ( Capturing group
    • [^"]+ Match 1+ times any char except "
  • ) Close group
  • "(?= keyword) Match " and assert what is directly to the right is a space and keyword

An example using JavaScript:

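const regex = /"([^"]+)"(?= keyword)/g;
const str = `"ignore this" blah "ignore this" blah "capture this" keyword "ignore this" blah`;
let m; // declare the match variable instead of creating an implicit global

while ((m = regex.exec(str)) !== null) {
  // Guard against an infinite loop on zero-length matches
  if (m.index === regex.lastIndex) {
    regex.lastIndex++;
  }
  console.log(m[1]); // group 1 holds the text between the quotes
}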
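
As an alternative sketch, assuming an environment that supports String.prototype.matchAll (ES2020), the same pattern can collect every capture without managing lastIndex by hand. The names pattern, text and captured are only illustrative.

```javascript
const pattern = /"([^"]+)"(?= keyword)/g; // matchAll requires the g flag
const text = `"ignore this" blah "ignore this" blah "capture this" keyword "ignore this" blah`;

// matchAll returns an iterator of match arrays; keep only group 1 of each
const captured = Array.from(text.matchAll(pattern), (match) => match[1]);
console.log(captured); // [ 'capture this' ]
```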

Upvotes: 2

Muhammad Soliman

Reputation: 23876

The string you want is between double quotes and followed by a specific keyword. Simply match a run of characters that are not " and that is followed by " and then keyword.

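var input = `"ignore this" blah "ignore this" blah "capture this" keyword "ignore this" blah`;
// A run of non-quote characters followed by a closing quote, optional whitespace and keyword.
// The original leading (?=\")? was an optional lookahead with no effect, so it is dropped.
var result = /[^"]+(?="\s*keyword)/i.exec(input);
console.log(result);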

Upvotes: 0

acesmndr

Reputation: 8525

Try using lookaround assertions:

var input = `"ignore this" blah "ignore this" blah "capture this" keyword "ignore this" blah`;
var result = /(?<=\")[A-Za-z0-9\ ]*(?=\" keyword)/i.exec(input)
console.log(result);

Here (?<=\") looks for content that follows " and (?=\" keyword) looks for content that is followed by " keyword.

More about Lookahead and Lookbehind Zero-Length Assertions here: https://www.regular-expressions.info/lookaround.html
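
One thing to watch with the character class [A-Za-z0-9\ ]: it rejects quoted text that contains punctuation. The sketch below uses a made-up input (not from the question) to compare it with a negated class [^"], which accepts anything except a quote. It assumes a JavaScript engine with lookbehind support.

```javascript
const text = `"it's here" keyword`; // hypothetical input containing an apostrophe

// Restricted class: the apostrophe is not in [A-Za-z0-9 ], so nothing matches
const strict = /(?<=")[A-Za-z0-9 ]*(?=" keyword)/i.exec(text);
console.log(strict); // null

// Negated class: any character except " is allowed between the quotes
const relaxed = /(?<=")[^"]*(?=" keyword)/i.exec(text);
console.log(relaxed[0]); // it's here
```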

Upvotes: 0
