Bowman

Reputation: 195

Replace only the matched part of text in Google Docs and preserve formatting

Suppose this is the first paragraph of our Google document:

Wo1rd word so2me word he3re last.

We need to search for and replace some parts of the text, but the change must show up in the revision history as if only those parts were edited, and we must not lose the formatting (bold, italic, color, etc.).

What I understand so far: capturing groups don't work in replaceText(), as noted in the documentation. We can use plain JavaScript replace(), but it only works on strings, and a Google document is made up of objects, not strings. After many attempts I ended up with the code attached below.

What I can't solve: how to replace only part of what was matched. Capturing groups are a powerful and fitting tool, but I can't use them for the replacement. Either they don't work, or I have to replace the whole paragraph, which is unacceptable because the revision history will show a full-paragraph replacement and the paragraph will lose its formatting. What if the pattern occurs in every paragraph but only one letter has to change? The history would show the whole document being replaced, and it would be hard to find what really changed.

My first idea was to compare the string that replace() returns with the paragraph's contents, character by character, and replace only what differs. But that works only if we are sure a single letter changed. What if the replacement deletes or adds words? How would the two texts be kept in sync? That is a much bigger problem.
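One possible way to handle added or deleted words, sketched here as an assumption rather than a tested Apps Script solution: instead of comparing position by position, strip the common prefix and the common suffix of the two strings and treat whatever remains as the changed span. The helper name `diffSpan` and its return shape are hypothetical.

```javascript
// Hypothetical helper: find the smallest single span that turns `before` into `after`.
function diffSpan(before, after) {
  var max = Math.min(before.length, after.length);

  // Length of the common prefix.
  var prefix = 0;
  while (prefix < max && before.charAt(prefix) === after.charAt(prefix)) {
    prefix++;
  }

  // Length of the common suffix, not overlapping the prefix.
  var suffix = 0;
  while (suffix < max - prefix &&
         before.charAt(before.length - 1 - suffix) ===
         after.charAt(after.length - 1 - suffix)) {
    suffix++;
  }

  return {
    start: prefix,                                           // offset of the first changed character
    removed: before.length - prefix - suffix,                // how many old characters to delete
    inserted: after.slice(prefix, after.length - suffix)     // new text to put in their place
  };
}

var d = diffSpan('Wo1rd word', 'Wo-A-rd word');
console.log(d.start, d.removed, d.inserted); // 2 1 -A-
```

The result could then be applied to the paragraph with `deleteText(d.start, d.start + d.removed - 1)` when `d.removed > 0`, followed by `insertText(d.start, d.inserted)`. Two separate changes in one paragraph collapse into one span covering everything between them, so this only stays minimal if each replacement is diffed and applied on its own.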

None of the topics I found, each read three times over, helped me get unstuck.

So, are there any ideas on how to solve this problem?

function RegExp_test() {
  var docParagraphs = DocumentApp.getActiveDocument().getBody().getParagraphs();
  var i = 0, text0, text1, test1, re, rt, count;

  // equivalent of .asText() ???
  text0 = docParagraphs[i].editAsText();  // obj
  // equivalent of .editAsText().getText(), .asText().getText()
  text1 = docParagraphs[i].getText();     // str

  if (text1 !== '') {
    re = new RegExp(/(?:([Ww]o)\d(rd))|(?:([Ss]o)\d(me))|(?:([Hh]e)\d(re))/g);  // v1
//    re = new RegExp(/(?:([Ww]o)\d(rd))/);         // v2

    count = (text1.match(re) || []).length;       // v1 (g flag): 3 matches; v2 (no g flag): 3 entries (match + 2 groups)

    if (count) {
      test1 = text1.match(re);   // v1: ["Wo1rd", "so2me", "he3re"]
//      for (var j = 0; j < count; j++) {
//        test1 = text1.match(re)[j];
//      }

      text0.replaceText("(?:([Ww]o)\\d(rd))", '\1-A-\2');   // GAS func
      // #1: \1, \2 etc - didn't work: " -A- word so2me word he3re last."
      test1 = text0.getText();

      // js func, text1 OK: "Wo1rd word so-B-me word he3re last.", just in memory now
      text1 = text1.replace(/(?:([Ss]o)\d(me))/, '$1-B-$2'); // working with str, not obj
      // rt OK: "Wo1rd word so-B-me word he-C-re last."
      rt = text1.replace(/(?:([Hh]e)\d(re))/, '$1-C-$2');

      // #2: we used capturing groups ok, but replaced whole line and lost all formatting
      text0.replaceText(".*", rt);
      test1 = text0.getText();
    }
  }
  Logger.log('Test finished')
}

Upvotes: 0

Views: 320

Answers (1)

Bowman

Reputation: 195

Found a solution. It is fairly primitive, but it can serve as a base for a more complete procedure that handles every capture group in a match, detects them, mixes them, and so on. If someone wants to improve it, you are welcome!

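// Replaces the text between capture group 1 and capture group 2 of every match,
// editing the Text element in place so the rest of the paragraph is untouched.
// Assumes a global regex of the form (group1)middle(group2) and a template like '$1X$2'.
function replaceTextCG(text0, re, to) {
  var res, pos_f, pos_l;
  var matches = text0.getText().match(re);
  var count = (matches || []).length;

  // Split the template around its $n tokens: '$1A$2' -> ['$1', 'A', '$2']
  to = to.replace(/(\$\d+)/g, ',$1,').replace(/^,/, '').replace(/,$/, '').split(",");
  re.lastIndex = 0;  // start from the beginning even if the regex was used before
  for (var i = 0; i < count; i++) {
    res = re.exec(text0.getText());
    if (!res) break;
    for (var j = 1; j < res.length - 1; j++) {
      pos_f = res.index + res[j].length;              // first character after group j
      pos_l = re.lastIndex - res[j + 1].length - 1;   // last character before group j + 1
      text0.deleteText(pos_f, pos_l);                 // end offset is inclusive
      text0.insertText(pos_f, to[1]);                 // only the literal middle part is inserted
      // Keep lastIndex in step with the edited text when the lengths differ
      re.lastIndex += to[1].length - (pos_l - pos_f + 1);
    }
  }
  return count;
}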

function RegExp_test() {
  var docParagraphs = DocumentApp.getActiveDocument().getBody().getParagraphs();
  var i = 0, text0, count;

  // equivalent of .asText() ???
  text0 = docParagraphs[i].editAsText();  // obj
  if (text0.getText() !== '') {
    count = replaceTextCG(text0, /(?:([Ww]o)\d(rd))/g, '$1A$2');
    count = replaceTextCG(text0, /(?:([Ss]o)\d(me))/g, '$1B$2');
    count = replaceTextCG(text0, /(?:([Hh]e)\d(re))/g, '$1C$2');
  }
  Logger.log('Test finished')
}
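As a possible next step, here is a minimal sketch of a variant that plans all offsets on the plain string first and then applies the edits from right to left, so earlier offsets stay valid even when the replacement is longer or shorter than the text it replaces. It assumes a global regex whose group 1 starts the match and whose group 2 ends it, and it takes the middle replacement directly instead of a '$1X$2' template. The names `planEdits`, `replaceBetweenGroups` and `FakeText` are hypothetical; `FakeText` is only an in-memory stand-in so the sketch can run outside Apps Script, and it mimics the `getText()`, `deleteText(start, endInclusive)` and `insertText(offset, text)` calls used above.

```javascript
// Plan one edit per match: replace whatever sits between group 1 and group 2.
// Assumption: `re` is global, group 1 starts the match and group 2 ends it.
function planEdits(text, re, middle) {
  if (!re.global) throw new Error('planEdits needs a regex with the g flag');
  var edits = [];
  var m;
  re.lastIndex = 0;
  while ((m = re.exec(text)) !== null) {
    if (m[0] === '') { re.lastIndex++; continue; } // skip empty matches
    edits.push({
      start: m.index + m[1].length,
      endInclusive: m.index + m[0].length - m[2].length - 1,
      insert: middle
    });
  }
  return edits.reverse(); // last match first, so earlier offsets are not shifted
}

// Apply the planned edits to anything that behaves like an Apps Script Text element.
function replaceBetweenGroups(textElement, re, middle) {
  var edits = planEdits(textElement.getText(), re, middle);
  edits.forEach(function (e) {
    if (e.endInclusive >= e.start) textElement.deleteText(e.start, e.endInclusive);
    if (e.insert !== '') textElement.insertText(e.start, e.insert);
  });
  return edits.length;
}

// In-memory stand-in for a Text element (hypothetical, for local testing only).
function FakeText(s) { this.s = s; }
FakeText.prototype.getText = function () { return this.s; };
FakeText.prototype.deleteText = function (start, endInclusive) {
  this.s = this.s.slice(0, start) + this.s.slice(endInclusive + 1);
  return this;
};
FakeText.prototype.insertText = function (offset, text) {
  this.s = this.s.slice(0, offset) + text + this.s.slice(offset);
  return this;
};

var demo = new FakeText('Wo1rd word so2me wo7rd');
replaceBetweenGroups(demo, /([Ww]o)\d(rd)/g, 'AAA');
console.log(demo.getText()); // WoAAArd word so2me woAAArd
```

In a real document the first argument would be something like `docParagraphs[i].editAsText()` in place of the `FakeText` instance. Whether the inserted characters pick up the surrounding formatting is not verified here and should be checked in the document itself.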

Upvotes: 1
