Luke V

Reputation: 351

Google Docs script to capitalize sentences

I am writing the Google Docs script below to capitalize the sentences in a document.

function cap6() {

   var body = DocumentApp.getActiveDocument().getBody();
   var text = body.editAsText();

   var str1 = text.getText();
   Logger.log(str1);

   // define function "replacement" to change the matched pattern to uppercase
   function replacement(match) { return match.toUpperCase(); }

   // period followed by any number of blank spaces (1,2,3, etc.)
   var reg = /\.(\s*\s)[a-z]/g;

   // capitalize sentence
   var str2 = str1.replace(reg, replacement);
   Logger.log(str2);

   // replace string str1 by string str2
   text.replaceText(str1, str2);

}

The code almost worked, in the sense that the log shows the correct result:

[15-10-22 22:37:03:562 EDT] capitalize sentences.  this is one example with ONE blank space after the period.  here is another example with TWO blank spaces after the period.          this is yet another example with MORE THAN THREE blank spaces.

[15-10-22 22:37:03:562 EDT] capitalize sentences.  This is one example with ONE blank space after the period.  Here is another example with TWO blank spaces after the period.          This is yet another example with MORE THAN THREE blank spaces.

The 1st log line above is the original paragraph without capitalized sentences; the 2nd log line is the transformed paragraph with capitalized sentences, regardless of the number of blank spaces after the period.

The problem is that I could not replace the original paragraph in the Google Doc with the transformed paragraph using this code:

   // replace string str1 by string str2
   text.replaceText(str1, str2);

I suspect that I made an error in the arguments of the method "replaceText".

Any help pointing out my errors would be appreciated. Thank you.

Upvotes: 3

Views: 453

Answers (4)

Luke V

Reputation: 351

To change the script so that it runs only on the selected text (e.g., a paragraph), and so avoids wiping out the existing formatting in other paragraphs of the Google Doc, I was inspired by the code I found in Class Range.

I also improved the regular expression in the variable "reg" so that the beginning of a line or paragraph is capitalized as well:

var reg = /(^|\.)(\s*)[a-z]/g;
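
To see what this pattern matches outside a document, here is a minimal plain-JavaScript sketch with no DocumentApp calls. The function names capitalizeSentences and capitalizeSentencesStrict, the sample string, and the stricter pattern are illustrative assumptions, not part of the script in this answer. Because \s* also accepts zero spaces, a period inside a token such as a domain name is treated as a sentence end; the stricter variant requires at least one whitespace character after the period.

```javascript
// Pure-string demo of the pattern from this answer.
function capitalizeSentences(str) {
  // start of string or a period, then zero or more whitespace, then a lowercase letter
  var reg = /(^|\.)(\s*)[a-z]/g;
  return str.replace(reg, function (match) { return match.toUpperCase(); });
}

// Hypothetical stricter variant: the period must be followed by whitespace.
function capitalizeSentencesStrict(str) {
  var reg = /(^|\.\s+)[a-z]/g;
  return str.replace(reg, function (match) { return match.toUpperCase(); });
}

var sample = "first one.  second one.third one. see example.com";

var loose = capitalizeSentences(sample);
// "First one.  Second one.Third one. See example.Com"

var strict = capitalizeSentencesStrict(sample);
// "First one.  Second one.third one. See example.com"
```

Which variant is preferable depends on the text: the loose pattern repairs sentences typed without a space after the period, while the strict one leaves abbreviations and URLs alone.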

Below is a script that capitalizes the sentences in the selected text (just run cap7, which calls cap8):

function cap7() {

   // script to capitalize the beginning of a paragraph and the sentences within.
   // highlight a number of paragraphs, then run cap7, which calls cap8.

   // get the selected text inside a google doc
   var selection = DocumentApp.getActiveDocument().getSelection();

   if (selection) {
     var elements = selection.getRangeElements();
     for (var i = 0; i < elements.length; i++) {
       var element = elements[i];

       // Only modify elements that can be edited as text; skip images and other non-text elements.
       if (element.getElement().editAsText) {
         var text = element.getElement().editAsText();

         // capitalize the sentences inside the selected text
         cap8(text);
       }
     }
   }

}

function cap8(text) {

   // define variable str1
   var str1 = text.getText();
   // Logger.log(str1);

   // define function "replacement" to change the matched pattern to uppercase
   function replacement(match) { return match.toUpperCase(); }

   // beginning of a line or period, followed by zero or more blank spaces
   var reg = /(^|\.)(\s*)[a-z]/g;

   // capitalize sentence; replace regular expression "reg" by the output of function "replacement"
   var str2 = str1.replace(reg, replacement);
   // Logger.log(str2);

   // replace whole text by str2
   text.setText(str2); // WORKING

   return text;

}

See also my question in the post "google doc script, capitalize sentences without removing other attributes".

Upvotes: 0

Luke V

Reputation: 351

Combining the answers above from Washington Guedes and Robin Gertenbach led to the following working script:

function cap6() {

   var body = DocumentApp.getActiveDocument().getBody();
   var text = body.editAsText();

   // define variable str1       
   var str1 = text.getText();

   // define function "replacement" to change the matched pattern to uppercase
   function replacement(match) { return match.toUpperCase(); }

   // period followed by any number of blank spaces (1,2,3, etc.)
   // var reg = /\.(\s*\s)[a-z]/g;
   // or replace \s*\s by \s+
   var reg = /\.(\s+)[a-z]/g;

   // capitalize sentence
   var str2 = str1.replace(reg, replacement);

   // replace the entire text by string str2
   text.setText(str2);

}

On the other hand, the above script wipes out all existing formatting (links, boldface, italics, underline) in the Google Doc.

So my next question is: how can I modify the script so that it runs on a selected (highlighted) paragraph instead of the whole Google Doc, to keep it from wiping out existing formatting?
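
One possible direction, sketched here under stated assumptions, is to avoid setText altogether and change only the letters that need capitalizing. The function name capitalizeInPlace is hypothetical. The sketch assumes the object passed in behaves like an Apps Script Text element, using only getText(), insertText(offset, text) and deleteText(startOffset, endOffsetInclusive). Whether the inserted letter picks up the formatting of its neighbour is an assumption that should be checked on a test document.

```javascript
// Sketch: uppercase only the matched letters, one character at a time,
// instead of rewriting the whole text with setText().
function capitalizeInPlace(text) {
  var str = text.getText();
  // same pattern as in the script above: period, whitespace, lowercase letter
  var reg = /\.(\s+)[a-z]/g;
  var match;
  while ((match = reg.exec(str)) !== null) {
    // the lowercase letter is the last character of the match
    var offset = match.index + match[0].length - 1;
    var upper = str.charAt(offset).toUpperCase();
    // insert the uppercase letter right after the original letter,
    // then delete the original; the text length stays the same,
    // so offsets computed from str remain valid
    text.insertText(offset + 1, upper);
    text.deleteText(offset, offset);
  }
  return text;
}
```

In a document this might be called as capitalizeInPlace(DocumentApp.getActiveDocument().getBody().editAsText()); that call is not exercised here and is only a guess at how the sketch would be wired in.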

Upvotes: 1

Robin Gertenbach

Reputation: 10786

The duplication comes from the line breaks: in RE2 (Google's regular expression engine), the dot operator does not match a line break unless you include the s flag.
You therefore get a number of matches equal to the number of paragraphs.

You don't need a resource-intensive replace method, though: you can just use text.setText(str2); instead of text.replaceText(".*", str2);
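
The effect can be imitated in plain JavaScript, whose dot also stops at line breaks unless the s flag is set. This is only an analogy using JavaScript's own regex engine, not RE2 or DocumentApp; the function names and the sample text are made up for illustration, and the pattern is ".+" rather than ".*" because JavaScript would otherwise also report empty matches.

```javascript
// Count how many separate runs the dot pattern finds in a multi-line string.
function countDotMatches(docText) {
  var matches = docText.match(/.+/g);
  return matches ? matches.length : 0;
}

// Replace every run with the full replacement text, as a stand-in for
// what happens when each paragraph matches and is replaced by str2.
function simulateReplaceAll(docText, replacement) {
  return docText.replace(/.+/g, function () { return replacement; });
}

var doc = "para one.\npara two.\npara three.";
var fixed = "Para one.\nPara two.\nPara three.";

var matchCount = countDotMatches(doc);           // one match per line: 3
var duplicated = simulateReplaceAll(doc, fixed); // 3 copies of the 3 lines
var lineCount = duplicated.split("\n").length;   // 9

// With the s flag the dot also matches line breaks, so there is one match.
var singleMatchCount = doc.match(/.+/gs).length; // 1
```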

Upvotes: 0

Luke V

Reputation: 351

In a flash of inspiration, I ALMOST solved the problem using the following code:

text.replaceText(".*", str2);

My inspiration actually came from reading about the method "replaceText".

The above code worked when I had only ONE paragraph in the Google Doc.

But when I had two paragraphs, the code produced a duplicate of the document, i.e., a 2nd exact copy of the two paragraphs just below the first two (with correct capitalization of the sentences, including the beginning of the 2nd paragraph, but not the beginning of the 1st paragraph).

When I had 3 paragraphs, I got 3 copies of these 3 paragraphs. The document before running the script is shown below:

capitalize sentences.  this is one example with ONE blank space after the period.  here is another example with TWO blank spaces after the period.          this is yet another example with MORE THAN THREE blank spaces.

capitalize sentences.  this is one example with ONE blank space after the period.  here is another example with TWO blank spaces after the period.          this is yet another example with MORE THAN THREE blank spaces.

capitalize sentences.  this is one example with ONE blank space after the period.  here is another example with TWO blank spaces after the period.          this is yet another example with MORE THAN THREE blank spaces.

After running the script, I got 3 copies of these 3 paragraphs (with correct capitalization of the sentences, including the beginning of the 2nd and 3rd paragraphs), as shown below:

capitalize sentences.  This is one example with ONE blank space after the period.  Here is another example with TWO blank spaces after the period.          This is yet another example with MORE THAN THREE blank spaces.

Capitalize sentences.  This is one example with ONE blank space after the period.  Here is another example with TWO blank spaces after the period.          This is yet another example with MORE THAN THREE blank spaces.

Capitalize sentences.  This is one example with ONE blank space after the period.  Here is another example with TWO blank spaces after the period.          This is yet another example with MORE THAN THREE blank spaces.







capitalize sentences.  This is one example with ONE blank space after the period.  Here is another example with TWO blank spaces after the period.          This is yet another example with MORE THAN THREE blank spaces.

Capitalize sentences.  This is one example with ONE blank space after the period.  Here is another example with TWO blank spaces after the period.          This is yet another example with MORE THAN THREE blank spaces.

Capitalize sentences.  This is one example with ONE blank space after the period.  Here is another example with TWO blank spaces after the period.          This is yet another example with MORE THAN THREE blank spaces.







capitalize sentences.  This is one example with ONE blank space after the period.  Here is another example with TWO blank spaces after the period.          This is yet another example with MORE THAN THREE blank spaces.

Capitalize sentences.  This is one example with ONE blank space after the period.  Here is another example with TWO blank spaces after the period.          This is yet another example with MORE THAN THREE blank spaces.

Capitalize sentences.  This is one example with ONE blank space after the period.  Here is another example with TWO blank spaces after the period.          This is yet another example with MORE THAN THREE blank spaces.

So there is still something wrong in the new code, which would work if I could get rid of the extra copies of the document.

Returning to the original code

text.replaceText(str1, str2);

I suspect that there was something wrong with using the variable "str1" as the 1st argument of the method "replaceText". I hope an expert can explain the error in my original code.
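
One documented detail that may be relevant: replaceText treats its first argument as a regular-expression pattern passed as a string, not as literal text, so characters such as . ( ) * in str1 are read as operators. The sketch below shows the difference using JavaScript's own RegExp and a hypothetical helper escapePattern that backslash-escapes metacharacters. Whether escaping alone would make replaceText(str1, str2) work is not established here, since str1 covers the whole body, paragraph breaks included; the helper only illustrates the literal-versus-pattern distinction.

```javascript
// Hypothetical helper: escape regex metacharacters so that a literal
// string can be used as a pattern that matches only itself.
function escapePattern(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var literal = "1.5 (approx)";

// Unescaped, "." matches any character and "(...)" is a group,
// so the pattern does not even match its own literal text.
var rawMatchesItself = new RegExp("^" + literal + "$").test(literal);

// Escaped, the pattern matches exactly the literal text and nothing else.
var escaped = new RegExp("^" + escapePattern(literal) + "$");
var escapedMatchesItself = escaped.test(literal);
var escapedMatchesOther = escaped.test("1x5 (approx)");
```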

Upvotes: 1
