David542
David542

Reputation: 110257

Efficiently sorting by LOWER in BigQuery

This wonderful blog posts talks about the inefficiencies of using LOWER() in where clauses and more efficient ways to search text. Is there a similar way to do that in the ORDER BY clause? For example:

SELECT * FROM mytable
ORDER BY LOWER(first_name)

What would be the most efficient way to sort by the lowercased-text of the field name, such as would be done in Google Sheets or Excel when sorting by a column?

Upvotes: 1

Views: 229

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173003

RegExp approach as in referenced blog will not work / applicable here (for ORDER BY ) because respective needed functionality is not supported by re2 library which is used for BigQuery regular expression support

Specifically - re2 does not support changing case in regex replace

if it would be supported - you could use something like

REGEXP_REPLACE(first_name, r'.*', r'\L\0') <-- just mockup - does not work with BigQuery

Here \L - converts all subsequent tokens into their respective lowercase equivalences
and \0 (this is supported by BQ) matches entire pattern so the the whole first_name column would be replaced with respective lowercased version

See example here

But - again - \L is not supported by BigQuery / re2

Upvotes: 1

Related Questions