Reputation: 32143
I have a problem with emoji in my production database. Since it's in production, all I get out of it is an auto-geneated excel spreadsheet (.xls
) every so often with tens of thousands of rows. I use Google Sheets to parse this so I can easily share the results.
What formula can I use to get a count of all cells in column n that contain emoji?
For instance:
Data
+----+-----------------+
| ID | Name |
+----+-----------------+
| 1 | Chad |
+----+-----------------+
| 2 | ✨Darla✨ |
+----+-----------------+
| 3 | John Smith |
+----+-----------------+
| 4 | Austin ⚠️ Powers |
+----+-----------------+
| 5 | Missus 🎂 |
+----+-----------------+
Totals
+----------------------------------+---+
| People named Chad | 1 |
+----------------------------------+---+
| People with emoji in their names | 3 |
+----------------------------------+---+
Upvotes: 1
Views: 3956
Reputation: 18717
Edit by Ben C. R. Leggiero:
=COUNTA(FILTER(A2:A6;REGEXMATCH(A2:A6;"[^\x{0}-\x{F7}]")))
Upvotes: 2
Reputation: 648
You cannot extract emojis with regular formula because Google Spreadsheet uses the light-weight re2 regex engine, which lacks many features, including those necessary to find emojis.
What you need to do is creating a custom formula. Select Tools menu, then Script editor.... In the script editor, add the following:
function find_emoji(s) {
var re = /[\u1F60-\u1F64]|[\u2702-\u27B0]|[\u1F68-\u1F6C]|[\u1F30-\u1F70]|[\u2600-\u26ff]|[\uD83C-\uDBFF\uDC00-\uDFFF]+/i;
if (s instanceof Array) {
return s.map(function(el){return el.toString().match(re);});
} else {
return s.toString().match(re);
}
}
Save the script. Go back to your spreadsheet, then test your formula =find_emoji(A1)
My test yields the following:
| Missus 🎂 | 🎂 |
| Austin ⚠️ Powers | ⚠ |
| ✨Darla✨ | ✨ |
| joke 😆😆 | 😆😆 |
And, to count entries that don't have emojis, you can use this formula:
=countif( arrayformula(isblank( find_emoji(filter(F2:F,not(isblank(F2:F)))))), FALSE)
EDIT
I was wrong. You can use regular formula to extract emoji. The regex syntax is [\x{1F300}-\x{1F64F}]|[\x{2702}-\x{27B0}]|[\x{1F68}-\x{1F6C}]|[\x{1F30}-\x{1F70}]|[\x{2600}-\x{26ff}]|[\x{D83C}-\x{DBFF}\x{DC00}-\x{DFFF}]
Upvotes: 1
Reputation: 10259
This should work:
=arrayformula(countif(REGEXMATCH(A2:A6,"[^a-zA-Z\d\s:]"),true))
Upvotes: 1