EnGlamdring

Reputation: 1

Why does this regex passed to findText() return the entire text, as if it were greedy?

I can't for the life of me figure out why this regex gobbles up the whole line in Google Docs. When I run the function below, I can't get it to return just {{ClientName}}.

Here is the text from my document:

{{ClientName}} would like to have a {{Product}} {{done/created}}. The purpose of this {{Product}} is to {{ProductPurpose}}. We have experience with such testing and development, and will develop and test the {{Product}} for {{ClientName}}.

function searchAndFind () {
     var foundText = DocumentApp.getActiveDocument().getBody().findText('\{\{([^,\s}{][a-zA-Z]+)\}\}').getElement().asText().getText()
     return foundText
}

Upvotes: 0

Views: 297

Answers (3)

TheAddonDepot

Reputation: 8974

Regex quantifiers are 'greedy' by default. You can make a quantifier (i.e. +, ?, * or {}) non-greedy by following it with ?.

For example:

  • x??
  • x*?
  • x+?
  • x{n}?
  • x{n,}?
  • x{n,m}?

Modify your regex to leverage this feature.

Check out the regex documentation on MDN and search the page (Ctrl+F in Chrome) for the term 'greedy' for more information.
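To make the difference concrete, here is a minimal sketch in plain JavaScript. It uses the built-in RegExp engine rather than findText(), on the assumption that greedy and lazy quantifiers behave the same way for a pattern this simple. The `sample` string is a shortened copy of the text from the question, and the `.+` patterns are illustrative only, not the asker's regex.

```javascript
// Sample text modeled on the document in the question.
const sample = '{{ClientName}} would like to have a {{Product}} {{done/created}}.';

// Greedy: .+ takes as much as it can, so the match runs to the LAST "}}".
const greedy = sample.match(/\{\{.+\}\}/)[0];

// Lazy: .+? takes as little as it can, so the match stops at the FIRST "}}".
const lazy = sample.match(/\{\{.+?\}\}/)[0];

console.log(greedy);
console.log(lazy);
```

The asker's pattern uses character classes rather than `.+`, so whether greediness explains the behavior in the question is a separate matter; the sketch only shows what the `?` suffix changes.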

Upvotes: 0

Cooper

Reputation: 64100

Try this:

function searchAndFind() {
  // Backslashes are doubled so that \s and \{ reach the regex engine; in a plain string literal '\s' is just 's'.
  // Call findText() once and reuse the RangeElement it returns.
  var rangeElement = DocumentApp.getActiveDocument().getBody().findText('\\{\\{([^,\\s}{][a-zA-Z]+)\\}\\}');
  var foundElement = rangeElement.getElement().asText().getText(); // full text of the containing element
  var start = rangeElement.getStartOffset();
  var end = rangeElement.getEndOffsetInclusive();
  var foundText = foundElement.slice(start, end + 1); // the end offset is inclusive, hence +1
  Logger.log('\nfoundElement: %s\nstart: %s\nend: %s\nfoundText:%s\n', foundElement, start, end, foundText);
  return foundText;
}

Logger.log output:

[18-12-11 13:04:34:863 MST] 
foundElement: {{ClientName}} would like to have a {{Product}} {{done/created}}. The purpose of this {{Product}} is to {{ProductPurpose}}. We have experience with such testing and development, and will develop and test the {{Product}} for {{ClientName}}.
start: 0.0
end: 13.0
foundText:{{ClientName}}

Upvotes: 0

TheMaster

Reputation: 50764

Issue:

This is because findText() returns a RangeElement object, which provides methods for getting the full text Element as well as the offset of the actual matched text in the Element. When you use getElement(), you get the whole element instead of just the matched string.

Solution:

Get offsets from the range element to get the actual text in the element.

Code Snippet:

function searchAndFind() {
  var rangeElement = DocumentApp.getActiveDocument()
    .getBody()
    .findText('{{([^,\\s]+)}}');

  return rangeElement
    .getElement()
    .asText()
    .getText()
    .substring(
      rangeElement.getStartOffset(),
      rangeElement.getEndOffsetInclusive()+1
    );
}
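One caveat with the snippet above: findText() returns null when nothing matches, so chaining calls onto its result throws on a document without placeholders. Below is a minimal sketch of a null-safe variant. The names `matchedText` and `fakeRangeElement` are hypothetical; the fake object only imitates the three RangeElement/Text methods used here, so the example can run outside Apps Script.

```javascript
// Returns only the matched substring, or null when findText() found nothing.
function matchedText(rangeElement) {
  if (rangeElement === null) return null;
  var fullText = rangeElement.getElement().asText().getText();
  // getEndOffsetInclusive() points at the last matched character, hence +1.
  return fullText.substring(
    rangeElement.getStartOffset(),
    rangeElement.getEndOffsetInclusive() + 1
  );
}

// Hypothetical stand-in for a RangeElement (not part of the Apps Script API).
function fakeRangeElement(text, start, endInclusive) {
  var textElement = { getText: function () { return text; } };
  var element = { asText: function () { return textElement; } };
  return {
    getElement: function () { return element; },
    getStartOffset: function () { return start; },
    getEndOffsetInclusive: function () { return endInclusive; }
  };
}

// Offsets 0 and 13 are the ones reported in the Logger.log output in Cooper's answer.
var fake = fakeRangeElement('{{ClientName}} would like to have a {{Product}}.', 0, 13);
console.log(matchedText(fake));
console.log(matchedText(null));
```

In a real script you would pass in the result of `body.findText(pattern)` instead of the fake object.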

Upvotes: 1
