kentobento
kentobento

Reputation: 55

Finding and Deleting All Emojis in a Google Spreadsheet

I have a Google Spreadsheet with thousands of cells with each cell being populated with strings with many different emojis.

Example of entries:

"Lol ๐Ÿ˜Š","Haha ๐Ÿ˜Š","Fire ๐Ÿ”ฅ","๐Ÿ‘๐Ÿ‘๐Ÿ‘Awesome!","Nice๐Ÿ‘ See you tomorrow!๐Ÿ˜€", "ใ“ใ‚“ใซใกใฏ๐Ÿ˜Š", "ไฝ ๅฅฝ๐Ÿ˜€"

But I want to delete all of the emojis, is there a search function I can run/piece of Spreadsheet code I can run to make the document devoid of emojis?

Upvotes: 4

Views: 4865

Answers (4)

Max Makhrov
Max Makhrov

Reputation: 18717

Replace emojis from text

I've found, you may use a REGEXREPLACE for that.

To replace all emojis from [A1] please try:

=REGEXREPLACE($A$1,"[๐Ÿป๐Ÿผ๐Ÿฝ๐Ÿพ๐Ÿฟยฉยฎโ€ผโ‰โ„ขโ„นโ†”-โ†™โ†ฉ-โ†ชโŒš-โŒ›โŒจโโฉ-โณโธ-โบโ“‚โ–ช-โ–ซโ–ถโ—€โ—ป-โ—พโ˜€-โ˜„โ˜Žโ˜‘โ˜”-โ˜•โ˜˜โ˜โ˜ โ˜ข-โ˜ฃโ˜ฆโ˜ชโ˜ฎ-โ˜ฏโ˜ธ-โ˜บโ™€โ™‚โ™ˆ-โ™“โ™Ÿ-โ™ โ™ฃโ™ฅ-โ™ฆโ™จโ™ปโ™พ-โ™ฟโš’-โš—โš™โš›-โšœโš -โšกโšงโšช-โšซโšฐ-โšฑโšฝ-โšพโ›„-โ›…โ›ˆโ›Ž-โ›โ›‘โ›“-โ›”โ›ฉ-โ›ชโ›ฐ-โ›ตโ›ท-โ›บโ›ฝโœ‚โœ…โœˆ-โœโœโœ’โœ”โœ–โœโœกโœจโœณ-โœดโ„โ‡โŒโŽโ“-โ•โ—โฃ-โคโž•-โž—โžกโžฐโžฟโคด-โคตโฌ…-โฌ‡โฌ›-โฌœโญโญ•ใ€ฐใ€ฝใŠ—ใŠ™๐Ÿ€„๐Ÿƒ๐Ÿ…ฐ-๐Ÿ…ฑ๐Ÿ…พ-๐Ÿ…ฟ๐Ÿ†Ž๐Ÿ†‘-๐Ÿ†š๐Ÿˆ-๐Ÿˆ‚๐Ÿˆš๐Ÿˆฏ๐Ÿˆฒ-๐Ÿˆบ๐Ÿ‰-๐Ÿ‰‘๐ŸŒ€-๐ŸŒก๐ŸŒค-๐ŸŽ“๐ŸŽ–-๐ŸŽ—๐ŸŽ™-๐ŸŽ›๐ŸŽž-๐Ÿฐ๐Ÿณ-๐Ÿต๐Ÿท-๐Ÿบ๐Ÿ€-๐Ÿ“ฝ๐Ÿ“ฟ-๐Ÿ”ฝ๐Ÿ•‰-๐Ÿ•Ž๐Ÿ•-๐Ÿ•ง๐Ÿ•ฏ-๐Ÿ•ฐ๐Ÿ•ณ-๐Ÿ•บ๐Ÿ–‡๐Ÿ–Š-๐Ÿ–๐Ÿ–๐Ÿ–•-๐Ÿ––๐Ÿ–ค-๐Ÿ–ฅ๐Ÿ–จ๐Ÿ–ฑ-๐Ÿ–ฒ๐Ÿ–ผ๐Ÿ—‚-๐Ÿ—„๐Ÿ—‘-๐Ÿ—“๐Ÿ—œ-๐Ÿ—ž๐Ÿ—ก๐Ÿ—ฃ๐Ÿ—จ๐Ÿ—ฏ๐Ÿ—ณ๐Ÿ—บ-๐Ÿ™๐Ÿš€-๐Ÿ›…๐Ÿ›‹-๐Ÿ›’๐Ÿ›•-๐Ÿ›—๐Ÿ›-๐Ÿ›ฅ๐Ÿ›ฉ๐Ÿ›ซ-๐Ÿ›ฌ๐Ÿ›ฐ๐Ÿ›ณ-๐Ÿ›ผ๐ŸŸ -๐ŸŸซ๐ŸŸฐ๐ŸคŒ-๐Ÿคบ๐Ÿคผ-๐Ÿฅ…๐Ÿฅ‡-๐Ÿงฟ๐Ÿฉฐ-๐Ÿฉด๐Ÿฉธ-๐Ÿฉผ๐Ÿช€-๐Ÿช†๐Ÿช-๐Ÿชฌ๐Ÿชฐ-๐Ÿชบ๐Ÿซ€-๐Ÿซ…๐Ÿซ-๐Ÿซ™๐Ÿซ -๐Ÿซง๐Ÿซฐ-๐Ÿซถ๐Ÿ‡ฆ-๐Ÿ‡ฟ#๏ธโƒฃ*๏ธโƒฃ0๏ธโƒฃ1๏ธโƒฃ2๏ธโƒฃ3๏ธโƒฃ4๏ธโƒฃ5๏ธโƒฃ6๏ธโƒฃ7๏ธโƒฃ8๏ธโƒฃ9๏ธโƒฃ]","")

I believe this regex will find all current emojis from your text.

Notes:

  • some emojis are compound for instance, an astronaut is ๐Ÿง‘๐Ÿผโ€๐Ÿš€. Regex needs to find only solid chars, so all compound emojis will be included.

  • I've tried to shorten the solution, and used actual emojis in RegEx. You may also see more "computer-like" solutions: [\u1F60-\u1F64]|[\u2702-\u27B0].... Those solutions use codes of emojis instead.

  • Another interesting option is given here. Remove all not printable chars: =REGEXREPLACE(A1,"[[:print:]]","")

  • skins are included:

enter image description here

Upvotes: 4

Cooper
Cooper

Reputation: 64100

Cleaning Up with Regular Expressions

I don't have the time to do the whole thing but this will give you a start. I cleaned everything in one cell with this.

var sht = SpreadsheetApp.getActiveSheet();
  var text = sht.getActiveCell().getValue();
  var cleantext = text.replace(/[^\s\w]/g,'');//replace everything that's not whitespace or word characters with null
  sht.getActiveCell().setValue(cleantext);

I used the line you provided as test data. Admittedly it needs a little tweaking because it's getting rid of some punctuation.

This is a little better.

function test()
{
  var sht = SpreadsheetApp.getActiveSheet();
  var text = sht.getActiveCell().getValue();
  var cleantext = text.replace(/[^\s\w"!,]/g,'');//added "!,
  sht.getActiveCell().setValue(cleantext);

}

So as you run it you may want to add a few more characters to don't replace list. That's it.

I have an expense report that I use to collect my expenses in different categories and I like to produce pie charts to help me get a big picture view of where my money is going. I use this Array Formula to help me gather the information into useful categories for me.

=ArrayFormula(IF(Row(C:C)=1,"Title",IF(LEN(C:C),IF(REGEXMATCH(C:C,"(?i)(string1|string2|string3|string4)"),D:D,""),)))

The regular expression provides an or function for adding additional matching for unexpected item appearing on my expense lists that I want to gather into these categories. If you need another matching term you just go into that formula and add another term as shown below

(string1|string2|string3|string4||string5)

The strings are replaced with real terms with no quotes unless they have quotes around them in the search target.

Upvotes: 3

Alan Wells
Alan Wells

Reputation: 31300

Here is some code that goes through one column of data and removes emojis from each cell.

You must replace Your Sheet Tab Name with the sheet tab name that the code should work on. This code currently only processes one column of data. The entire column of values is written back to the sheet in one write operation. Any character codes that are 5 characters or more are assumed to be emojis.

Test it on a few rows of data first.

function killEmojies() {
  var arrayThisRow,columnOfValues,columnToRemoveEmojiesFrom,firstTwoChar,
      i,innerArray,j,L,newCellContent,outerArray,
      ss,sh,
      targetSheet,thisCell,thisCellChar,thisCellVal,thisCharCode,thisCharCodeLength;

  columnToRemoveEmojiesFrom = 1;
  outerArray = [];

  ss = SpreadsheetApp.getActiveSpreadsheet()
  sh = ss.getSheetByName("Your Sheet Tab Name Here");
  targetSheet = ss.getSheetByName("Your Sheet Tab Name Here");

  columnOfValues = sh.getRange(1, columnToRemoveEmojiesFrom,sh.getLastRow(),1).getValues(); 
  L = columnOfValues.length;
  Logger.log('L: ' + L);

  for (i=0;i<L;i++) {

    thisCell = columnOfValues[i];//Get inner array
    thisCellVal = thisCell[0];//Get first element of inner array

    Logger.log(thisCellVal)
    Logger.log('typeof thisCellVal: ' + typeof thisCellVal)

    newCellContent = "";//Reset for every cell
    innerArray = [];//Reset for every row loop

    if (typeof thisCellVal !== 'string') {//This spreadsheet cell contains something
      //other than text
      innerArray.push(thisCellVal);
    } else {
      for (j=0;j<thisCellVal.length;j++) {//Loop through every character in the cell
        thisCellChar = thisCellVal[j];
        thisCharCode = thisCellChar.charCodeAt(0);//Character code of this character
        thisCharCodeLength = thisCharCode.toString().length;

        Logger.log('typeof thisCharCodeLength: ' + typeof thisCharCodeLength);
        Logger.log('this val: ' + thisCharCode);
        Logger.log('thisCharCodeLength: ' + thisCharCodeLength);
        Logger.log(thisCharCodeLength < 5);

        if (thisCharCodeLength === 5) {
          firstTwoChar = thisCharCode.toString().slice(0,2);
          Logger.log('firstTwoChar: ' + firstTwoChar)

        }

        if (thisCharCodeLength > 4 && (firstTwoChar === "54" || firstTwoChar === "55" || firstTwoChar === "56")) {
          continue;//exclude character codes that are 5 or more characters long
          //and start with 54 or 55
        }
        newCellContent = newCellContent + thisCellChar;

      }

      innerArray.push(newCellContent);
    }

    outerArray.push(innerArray);
  }

  targetSheet.getRange(1, columnToRemoveEmojiesFrom,outerArray.length,1).setValues(outerArray);
}

Upvotes: 2

Samer
Samer

Reputation: 63

Assuming all your text strings are single words followed by a space and then an Emoji, you can use the formula

=LEFT(A1,(FIND(" ",A1,1)-1))

This will return the textual contents of a cell only (A1 in this example). If all your data is in a single column, you can just pull down and this will apply to all your data.

Upvotes: 0

Related Questions