Ky -
Ky -

Reputation: 32143

How do I count all cells in a column that have emoji?

I have a problem with emoji in my production database. Since it's in production, all I get out of it is an auto-geneated excel spreadsheet (.xls) every so often with tens of thousands of rows. I use Google Sheets to parse this so I can easily share the results.

What formula can I use to get a count of all cells in column n that contain emoji?

For instance:

Data
+----+-----------------+
| ID | Name            |
+----+-----------------+
| 1  | Chad            |
+----+-----------------+
| 2  | ✨Darla✨         |
+----+-----------------+
| 3  | John Smith      |
+----+-----------------+
| 4  | Austin ⚠️ Powers |
+----+-----------------+
| 5  | Missus 🎂        |
+----+-----------------+

Totals
+----------------------------------+---+
| People named Chad                | 1 |
+----------------------------------+---+
| People with emoji in their names | 3 |
+----------------------------------+---+

Upvotes: 1

Views: 3956

Answers (3)

Max Makhrov
Max Makhrov

Reputation: 18717

Edit by Ben C. R. Leggiero:

=COUNTA(FILTER(A2:A6;REGEXMATCH(A2:A6;"[^\x{0}-\x{F7}]")))

Upvotes: 2

daniel
daniel

Reputation: 648

You cannot extract emojis with regular formula because Google Spreadsheet uses the light-weight re2 regex engine, which lacks many features, including those necessary to find emojis.

What you need to do is creating a custom formula. Select Tools menu, then Script editor.... In the script editor, add the following:

function find_emoji(s) {
  var re = /[\u1F60-\u1F64]|[\u2702-\u27B0]|[\u1F68-\u1F6C]|[\u1F30-\u1F70]|[\u2600-\u26ff]|[\uD83C-\uDBFF\uDC00-\uDFFF]+/i;
  if (s instanceof Array) {
    return s.map(function(el){return el.toString().match(re);});
  } else {
    return s.toString().match(re);
  }
}

Save the script. Go back to your spreadsheet, then test your formula =find_emoji(A1)

My test yields the following:

| Missus 🎂        | 🎂   |
| Austin ⚠️ Powers | ⚠   |
| ✨Darla✨        | ✨   |
| joke 😆😆        | 😆😆 |

And, to count entries that don't have emojis, you can use this formula:

=countif( arrayformula(isblank( find_emoji(filter(F2:F,not(isblank(F2:F)))))), FALSE)

EDIT

I was wrong. You can use regular formula to extract emoji. The regex syntax is [\x{1F300}-\x{1F64F}]|[\x{2702}-\x{27B0}]|[\x{1F68}-\x{1F6C}]|[\x{1F30}-\x{1F70}]|[\x{2600}-\x{26ff}]|[\x{D83C}-\x{DBFF}\x{DC00}-\x{DFFF}]

Upvotes: 1

Ed Nelson
Ed Nelson

Reputation: 10259

This should work:

=arrayformula(countif(REGEXMATCH(A2:A6,"[^a-zA-Z\d\s:]"),true))

Upvotes: 1

Related Questions